IDENTIFICATION AND CHARACTERIZATION OF INTERACTING MOLECULES



The present invention relates to an improved method for the
identification and optionally the characterization of
interacting molecules, designed to distinguish positive clones
from the rather large numbers of false positive clones isolated
by two-hybrid systems. The method of the invention relies on a
novel combination of selection steps used to distinguish clones
that express interacting molecules from false positive clones.
The present invention further relates to a kit useful for
carrying out the method of the invention. The present
invention provides for parallel, high-throughput or automated
interaction screens for the reliable identification of
interacting molecules.

Protein-protein interactions are essential for nearly all
biological processes, such as replication, transcription,
secretion, signal transduction and metabolism. Classical
methods for identifying such interactions, such as
co-immunoprecipitation or cross-linking, are not available for
all proteins or may not be sufficiently sensitive. Said methods
have the further disadvantage that potentially interacting
partners and the corresponding nucleic acid fragments or
sequences can be identified only with a great deal of effort.
Usually, this is effected by protein sequencing or production
of antibodies, followed by the screening of an expression
library.

An important development for the convenient identification of
protein-protein interactions was the yeast two-hybrid (2H)
system presented by Fields and Song (1989). This genetic
procedure allows not only the rapid demonstration of in vivo
interactions, but also the simple isolation of the corresponding
nucleic acid sequences encoding the interacting partners.
The yeast two-hybrid system makes use of a feature shared by a
wide variety of eukaryotic transcription factors, which carry
two separable functional domains: a DNA binding domain and a
second domain which activates the RNA polymerase complex
(activation domain). In the classical 2H system a so-called
"bait" protein, comprising a DNA binding domain (GAL4bd or
LexA) and a protein of interest "X", is expressed as a fusion
protein in yeast. The same yeast cell simultaneously expresses
a so-called "fish" protein, comprising an activation domain
(GAL4ad or VP16) and a protein "Y".
Upon the interaction of a bait protein with a fish protein,
the DNA binding and activation domains of the fusion proteins
are brought into close proximity and the resulting protein
complex triggers the expression of the reporter genes, for
example, HIS3 or lacZ. Said expression can be easily monitored
by cultivation of the yeast cells on selective medium without
histidine and by assaying the activation of the lacZ gene. The
genetic sequence encoding, for example, an unknown fish
protein may easily be identified by isolating the
corresponding plasmid and subsequent sequence analysis.
Meanwhile, a number of variants of the 2H system have been
developed. The most important of those are the "one hybrid"
system for the identification of promoter binding proteins and
the "tri-hybrid" system for the identification of RNA-protein
interactions (Li and Herskowitz, 1993; SenGupta et al., 1996;
Putz et al., 1996). It is understood in the art that to
identify, detect or assay the variety of interactions found in
biological systems, different 2H systems must be employed.
Indeed, other 2H technologies have been developed to enable
protein-protein interactions to be investigated in other
organisms and/or different cell compartments, for example in
mammalian cells (Rossi et al., 1997; PNAS 94:8405-8410), in
bacterial cells (Karimova et al., 1998; PNAS 95:5752-5756), in
the cytoplasm of yeast cells (Johnsson & Varshavsky, 1996;
US 5503977) and in the periplasmic space of yeast cells
(Fowlkes et al., 1998; US 5789184).

These 2H systems for the identification of protein-protein
interactions have, until today, only been carried out on a
laboratory scale. The various steps of these systems need to
be conducted serially. They are, therefore, quite
time-consuming. As a consequence, these 2H systems have so far
proven unsuitable for eukaryotic library vs. library screens
that investigate protein-protein networks.
Although recent developments have taken these disadvantages
into account (Bartel et al., 1996), a successful large-scale
search for interacting proteins, for example on the basis of a
eukaryotic library vs. library screen, has not been reported.
More importantly, 2H systems suffer from the serious
drawback that many false positive clones, which do not
represent any interaction between binding partners, are
isolated. This is particularly inconvenient in cases where
large numbers of clones are to be analyzed, because a
eukaryotic library vs. library screen typically requires
several hundreds of thousands of clones to be analyzed for the
investigation of protein-protein networks.

The technical problem underlying the present invention was 
therefore to overcome these prior art difficulties and to 
furnish a system that reliably produces clones that express 
interacting molecules. This system should, moreover, be
suitable for large-scale library vs. library screens using a
parallel, high-throughput or automated approach.

The solution to said technical problem is achieved by 
providing the embodiments characterized in the claims. 

Accordingly, the present invention relates to a method for the 
identification of at least one member of a pair or complex of 
interacting molecules, comprising: 

(a) providing host cells containing at least two genetic
elements with different selectable and counterselectable
markers, said genetic elements each comprising genetic
information specifying one of said members, said host
cells further carrying a readout system that is activated
upon the interaction of said molecules;

(b) allowing at least one interaction, if any, to occur;

(c) selecting for said interaction by transferring progeny of
said host cells to

(ca) at least two different selective media, wherein each of
said selective media allows growth of said host cells
only in the absence of at least one of said
counterselectable markers and in the presence of a
selectable marker; and

(cb) a further selective medium that allows identification of
said host cells only on the activation of said readout
system;

(d) identifying host cells containing interacting molecules
that

(da) do not activate said readout system on any of said
selective media specified in (ca); and

(db) activate the readout system on said selective medium
specified in (cb); and

(e) identifying at least one member of said pair or complex
of interacting molecules.

Preferably, said interaction is a specific interaction. 

The terms "identification" and "identifying", as used in
accordance with the present invention, relate to the ability
of the person skilled in the art to distinguish positive clones
that express interacting molecules from false positive clones
on the basis of the activation of the readout system on the
selective media, and optionally in addition to characterize at
least one of said interacting molecules by one or a set of
unambiguous features. Preferably, said molecules are
characterized by the DNA sequence encoding them, upon nucleic
acid hybridization or isolation and sequencing of the
respective DNA molecules.
Alternatively and less preferred, said molecules may be
characterized by different features such as molecular weight,
isoelectric point and, in the case of proteins, the N-terminal
amino acid sequence etc. Methods for determining such
parameters are well known in the art.

Preferably, said members specified by said genetic elements 
are connected to a further entity that will upon the 
interaction activate or contribute to the activation of said 
readout system. It is further preferred that said entity is
conserved for each type of genetic element and that different 
types of genetic elements comprise different entities. It is 
additionally preferred that said member of said pair or 
complex of interacting molecules forms, when transcribed as 
RNA from said genetic element, an RNA transcript fused with 
RNA specifying said entity. Most preferably, said fused RNA 
transcript is translated to form a fusion protein comprising 
said member fused to said entity. As will be elaborated 
further herein below, said entity may be in one type of 
genetic element a DNA sequence encoding a DNA-binding domain 
and in a different type of genetic element a transactivating 
protein domain. Preferably, said genetic elements are vectors 
such as plasmids. Alternatively, interaction between two 
fusion proteins may result in a functional entity with 
reconstituted enzymatic activity, for example the bacterial 
chloramphenicol acetyltransferase protein (CAT) (Seed & Sheen,
1988; Gene 67:271-277). The at least two genetic elements
comprised in said host cell are preferably vectors from a
library such as a cDNA or genomic library. Thus, the method of 
the invention allows the screening of a variety of host cells 
wherein the vector portion of said genetic elements is 
preferably the same for each type of genetic element whereas 
the potentially interacting molecules are representatives of a 
library and thus, as a rule and provided that the library has
not been amplified, may differ in each host cell. In this
connection the term "type of genetic element" refers to an
element characterized by comprising the same entity and the
same selectable and counterselectable markers.

Preferably, the "interaction" of said molecules is specific
and characterized by a high binding constant. However, the 
term "interaction" may also refer to a binding between 
molecules with a lower binding constant which, however, must 
be sufficient to activate the readout system. The interaction 
that is detectable by the method of the invention preferably 
leads to the formation of a functional entity having a 
biological, physical or chemical activity which was not 
present in said host cell before said interaction occurred. 

Said interaction may lead to the formation of a functional 
transcriptional activator comprising a DNA-binding and a 
transactivating protein domain and which is capable of 
activating a responsive moiety that drives the activation of 
said readout system. For example, said moiety may be a 
promoter. 

Alternatively, said interaction may lead to a detectable 
fluorescence resonance energy transfer obtained by the 
interaction of fusion proteins containing, for example, the 
GFP type a and GFP type b fluorescent proteins (Cubbitt et
al., 1995; Heim & Tsien, 1996; Curr. Biol. 6:178-182). Said
interaction may alternatively lead to the reconstitution
of a functional enzyme, for example β-galactosidase (Rossi et
al., 1997) or adenylate cyclase (Karimova et al., 1998). These
embodiments will be preferred for the study of interactions in 
host-cell types other than yeast. 

In a further embodiment, said interaction may lead to a 
detectable modification of a substrate by an enzyme such as a 
color reaction obtained by the cleavage of a propeptide by an 
enzyme. In all these embodiments of the invention, it is 
understood that the interacting molecules are preferably 
directly fused to the molecules driving the readout system. 

The term "growth" on selective media "in the absence of at
least one of said counterselectable markers" refers to the
fact that a population of host cells containing at least one
of said genetic elements is placed on said selective media but
only those progeny of the host cells in the overall population
that have lost the relevant genetic element are able to grow.
For example, when a yeast strain which is resistant to the drug
canavanine (can^r) and which also contains a plasmid carrying
the wild-type CAN1 gene (Hoffmann, 1985) is placed on a
selective medium containing canavanine, only those progeny of
the yeast strain that have lost the plasmid carrying the CAN1
gene are able to grow, because this gene confers sensitivity
to canavanine in yeast cells.

With reference to step (ca), it should be noted that each of
the at least two selective media would comprise at least one
counterselectable compound such as cycloheximide, wherein the
counterselectable compound would be different in the different
selective media; they would further typically lack a compound
complementing an auxotrophic marker or comprise an
antibiotic. The compound or antibiotic may be the same for the
various selective media. Preferably, at least one is
different.

The method of the present invention provides a highly
effective tool for selecting against false positive clones,
which have proven to dramatically reduce the overall usefulness
of the two-hybrid system. For example, by inclusion of a
marker counterselecting for the absence of a genetic element
that specifies one of a pair of the potentially interacting
partners, clones that grow, and therefore carry only the
second genetic element specifying the second partner, can now
be tested for the activation of the readout system. If the
clone containing only the fusion protein encoded by the second
genetic element activates the readout system in the absence of
the other genetic element, then it will be classified as a
false positive. By counterselecting for the absence of the
second genetic element, the same test is applied to the first
genetic element. Thus, only clones that activate the readout
system in the presence of both or all genetic elements, but do
not activate the readout system when either of the genetic
elements is lost, are classified as positives.
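
The decision rule set out in the preceding paragraph can be
illustrated by a minimal Python sketch. It is an illustration
only and forms no part of the method as described; the names
Readout and classify, and the encoding of the readout state as
true/false values, are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Readout:
    """Readout state of one clone (hypothetical encoding, True = activated)."""
    both_present: bool   # medium on which both genetic elements are retained
    first_lost: bool     # medium counterselecting the first genetic element
    second_lost: bool    # medium counterselecting the second genetic element


def classify(r: Readout) -> str:
    """Call a clone positive only if the readout requires both elements."""
    if r.first_lost or r.second_lost:
        # The readout is activated although one partner is missing.
        return "false positive"
    if r.both_present:
        return "positive"
    return "negative"
```

A clone that activates the readout system only when both
genetic elements are present is returned as "positive"; any
activation after loss of one element yields "false positive".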

The advantages associated with the method of the invention 
have a significant impact in particular on the number of 
clones that express potentially interacting partners that can 
conveniently be analyzed. For example, even work on the 
laboratory scale will be more effective since positive clones 
that express interacting partners can be easily and 
unambiguously discriminated from false positive clones without 
the generation of additional strains. In contrast, to detect
false positive clones using the state of the art yeast
two-hybrid system, plasmids that encode fish proteins usually
need to be isolated and retransformed into yeast cells harboring
plasmids that encode unrelated bait proteins. Further, the
enormous number of false positive clones that would be
isolated when using the classical two-hybrid system on a large
scale, yet are discriminated by the method of this invention,
no longer precludes an effective high-throughput analysis of
clones. In the long run, it is expected that the method of the
present invention is especially advantageous for a
high-throughput analysis of a large number of yeast clones
containing interacting molecules since many specific 
interactions and the individual members of these interactions 
can be identified in a parallel and automated approach. 

Some investigators have noted the problem of identifying false 
positive clones when applying the yeast two-hybrid system in 
the past. Bartel et al. (1996) described a method for the 
elimination of false positives by replica plating clones that 
express one fusion protein from SD-leu and SD-trp plates to
SD-his plates. Clones that showed growth on the SD-his plates
were identified as false positives and were subsequently not
used for interaction mating. The disadvantage of this method
is that the procedure is labor intensive because yeast strains
expressing the fish proteins, the bait proteins and the
potentially interacting fish and bait proteins all must be
generated and analyzed. The use of the counterselectable
system described in this invention has the advantage that only
one strain, which expresses the potentially interacting fusion
proteins, must be generated and analyzed.

Other strategies have been proposed to eliminate false 
positive clones from 2H systems (Vidal et al., 1996a; 
Nandabalan et al., 1997). However, these systems all require 
that the readout system that is assayed for activity comprises 
at least one reporter gene that is transcribed on 
reconstitution of DNA binding and transactivating fusion 
proteins. Indeed, although mostly claiming to be applicable to 
all types of cells, these systems have been designed towards 
the specific biological properties of the yeast two-hybrid 
system. The method of the invention described herein is not
limited to eliminating false positive clones expressing single
DNA binding or activation domain fusion proteins that can
activate the reporter system. On the contrary, it can be used
to eliminate false positive clones in 2H systems other than
yeast two-hybrid, which is of advantage when interaction
screens are conducted in, for example, other host-cell types.

A schematic overview of one embodiment of the method of the 
invention is provided in Figure 6. For the parallel analysis 
of a network of protein-protein interactions with the method 
of the invention, a library of plasmid constructs that express 
DNA binding domain and activation domain fusion proteins is 
provided. These libraries may consist of specific DNA 
fragments or a multitude of unknown DNA fragments ligated into 
the improved binding domain and activating domain plasmids of 
the invention containing different selectable and 
counterselectable markers. Both libraries are combined within 
yeast cells by transformation or interaction mating, and yeast 
strains that express potentially interacting proteins are 
selected on selective medium lacking histidine. The selective 
markers TRP1 and LEU2 maintain the plasmids in yeast strains 
grown on selective media, whereas CAN1 and CYH2 specify the
counterselectable markers that select for the loss of each
plasmid. HIS3 and lacZ represent selectable markers integrated
into the yeast genome, which are expressed on activation by 
interacting fusion proteins. 

The readout system is, in the present case, both growth on
medium lacking histidine and enzymatic activity of
β-galactosidase, which can be subsequently screened. It is to be
understood, however, that the readout system may rely on only
one marker such as HIS3. Yet, the combination of two
components that constitute the readout system in many cases
allows a more ready interpretation of results, in particular
if one of the components, when activated, effects a change in
color.

A colony picking robot is used to pick the resulting
yeast colonies into individual wells of 384-well microtiter
plates containing selective medium lacking histidine, and the
resulting plates are incubated at 30°C to allow cell growth.
The interaction library contained in microtiter plates can
optionally be replicated and stored. The resulting interaction
library is investigated to detect positive clones that express
interacting proteins and discriminate them from false positive
clones using the method of the invention. Using a spotting
robot, cells are transferred to replica membranes which are
subsequently placed onto one each of the selective media
SD-leu-trp-his, SD-leu+CAN and SD-trp+CHX. After incubation on
the selective plates, the clones grown on the membranes are
subjected to a β-Gal assay and a digital image of each
membrane is obtained with a CCD camera and then stored on
computer. Using digital image processing and analysis, clones
that express interacting fusion proteins can be identified by
considering the pattern of β-Gal activity from clones grown on
the various selective media. The individual members comprising
interactions can then be identified by one or more techniques,
including PCR, sequencing, hybridization, oligofingerprinting
or antibody reactions. An actual experiment carried out along
the schematic route presented in Figure 6 is shown in Figure 5
to Figure 22.
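
The comparison of β-Gal activity patterns across the three
membranes can be sketched as follows. This is a hedged
illustration rather than the image analysis actually used: it
assumes that each membrane image has already been reduced to a
grid of normalized spot intensities between 0 and 1, and the
function name positive_spots and the threshold of 0.5 are
hypothetical choices made for this example.

```python
def positive_spots(his, leu_can, trp_chx, threshold=0.5):
    """Return (row, column) positions of clones called positive.

    his, leu_can, trp_chx: equally sized lists of lists holding the
    normalized beta-Gal signal of each spot on the membranes grown on
    SD-leu-trp-his, SD-leu+CAN and SD-trp+CHX, respectively.
    threshold: assumed cut-off above which a spot counts as activated.
    """
    hits = []
    for r, row in enumerate(his):
        for c, signal in enumerate(row):
            active_with_both = signal >= threshold
            active_with_one = (leu_can[r][c] >= threshold
                               or trp_chx[r][c] >= threshold)
            # Positive: readout active only while both plasmids are present.
            if active_with_both and not active_with_one:
                hits.append((r, c))
    return hits
```

Spots that are also active on either counterselective medium
are left out of the result, which corresponds to their
classification as false positives.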

The genetic elements specified here and above may further and
advantageously be equipped with at least two different
selection markers functional in bacteria such as E. coli. Such
selection markers, for example aphA (Pansegrau et al., 1987)
or bla, allow the easy separation of said genetic elements upon
retransformation into E. coli strains.

In a preferred embodiment of the method of the present 
invention said pair or complex of interacting molecules is 
selected from the group consisting of RNA-RNA, RNA-DNA,
RNA-protein, DNA-DNA, DNA-protein, protein-protein,
protein-peptide, or peptide-peptide interactions.

Accordingly, the method of the invention is applicable in a 
wide range of biological interactions. For example, the 
invention will be useful in identifying peptide-protein or
peptide-peptide interactions by employing synthetic peptide
libraries (Yang et al., 1995).

Two applications of interest are the application of a
large-scale two-hybrid system for the detection of
protein-protein interactions involved in medically relevant
pathways, which may be useful as diagnostic or therapeutic
targets for the treatment of disease, and a large-scale
tri-hybrid system, which is one example of said complex of
interacting molecules mentioned herein above, for the
identification of, for example, novel post-transcriptional
regulators and their binding sites
(SenGupta et al., 1996; Putz et al., 1996). In this regard it
should be noted that a complex, in accordance with the
invention, may comprise more than three interacting molecules.
Furthermore, such a complex may be composed of biologically or
chemically different members. For example, to identify
interacting RNA binding proteins and RNA molecules, a plasmid
expressing a LexA-HIV-1 Rev protein, a plasmid transcribing an
RNA sequence in fusion with the Rev responsive element and a
plasmid expressing a potentially RNA-interacting protein in
fusion with an activation domain may be present in one cell.
The plasmids encoding the RNA fusion molecule and the
activation domain fusion protein must contain different
selectable and counterselectable markers according to the
method of the invention. If the RNA fusion molecule interacts
with the respective two fusion proteins, the readout system is
activated. To test whether the RNA fusion molecule and the
activation domain fusion protein interact, the method of the
invention is used to investigate the activation of the readout
system in the absence of either of these fusion molecules.

In a further preferred embodiment, said genetic elements are 
plasmids, artificial chromosomes, viruses or other 
extrachromosomal elements. 

Whereas it is preferred, due to the easy handling, to employ 
plasmids that specify the genetic elements in accordance with 
the present invention, the persons skilled in the art will be 
able to devise other systems that carry said genetic elements 
and that are identified above. 

In an additional preferred embodiment, said readout system is 
a detectable protein. A number of readout systems are known in 
the art and may, if necessary, be adapted to be useful in the 
method of the invention. 

Most preferably, said detectable protein is that encoded by
the gene lacZ, HIS3, URA3, LYS2, sacB or HPRT, respectively.
As is well known in the art, the expression of the β-gal
enzyme in yeast can be used for the formation of a detectable
blue colony after incubation in X-Gal solution. Of course, the
method of the invention is not restricted to the use of only
one readout system. On the contrary, if desired, a number of
such readout systems may be combined. Said combination of a
number of readout systems is, in accordance with the present
invention, also comprised by the term "readout system". Such a
combination will provide an additional safeguard for the
identification of clones containing interacting partners.

Although the two-hybrid system has been developed in yeast,
the method of the invention can be carried out in a variety of
host systems. Preferred of those are yeast cells, bacterial
cells (Karimova et al., 1998), mammalian cells (Wu et al.,
1996; Rossi et al., 1997), insect cells or plant cells.
Preferably, the bacterial cells are E. coli cells.

Of course, the genetic elements may be engineered and prepared
in one host organism and then, e.g. by employing shuttle
vectors, be transferred to a different host organism where they
are employed in the method of the invention.

In another preferred embodiment, the method of the present 
invention comprises transforming or transfecting said host 
cell with at least one of said genetic elements prior to step
(a).

Whereas the person skilled in the art may initiate the 
identification method of the invention starting from fully 
transformed or transfected host cells, he may wish to first 
generate such host cells in accordance with the aim of his 
research or commercial interest. For example, he may wish to 
generate a certain type of library first that he intends to 
screen against a second library already present in said host 
cells. Alternatively, he may have in mind to generate two or 
more different libraries that he wants to screen against each 
other. In this case, he would need to first transform said 
host cells, simultaneously or successively, with both or all 
types of genetic elements. 

In another preferred embodiment, said host cells with said 
genetic elements are generated by cell fusion, conjugation or 
interaction mating.

The biological principle of counterselection referred to
above is well known in the art. Accordingly, the person
skilled in the art may choose from a variety of such
counterselectable markers. Preferably, said markers are CAN1,
CYH2, LYS2, URA3, HPRT or sacB.

It is further preferred in accordance with the present 
invention that said selectable markers are auxotrophic or 
antibiotic markers. 

It is important to note that some of the markers that are used
as a readout system may also be used as selectable markers.
It is further important to note that one and the same marker
cannot be used as a selectable marker and as part of the
readout system at the same time.
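
When planning a screen, the two marker constraints stated so
far (different selectable and counterselectable markers on the
different genetic elements, and no marker serving both as a
selectable marker and as part of the readout system) can be
checked mechanically. The following Python sketch is an
assumption-laden illustration: the function name
check_marker_design and the dictionary layout are hypothetical,
and the pairing of TRP1 with CAN1 and of LEU2 with CYH2 in the
test data is inferred from the media SD-leu+CAN and SD-trp+CHX
named in the description.

```python
def check_marker_design(elements, readout):
    """Return a list of problems in a marker layout (empty list if none).

    elements: list of dicts with the keys "selectable" and
              "counterselectable", one dict per type of genetic element.
    readout:  set of marker names that make up the readout system.
    """
    problems = []
    # Each type of genetic element needs its own markers.
    for key in ("selectable", "counterselectable"):
        names = [element[key] for element in elements]
        if len(set(names)) != len(names):
            problems.append(f"{key} markers are not all different")
    # A marker cannot be selectable marker and readout at the same time.
    for element in elements:
        marker = element["selectable"]
        if marker in readout:
            problems.append(f"{marker} is used for selection and readout")
    return problems
```

An empty result indicates that the layout satisfies both
constraints; it does not show that the markers are otherwise
suitable for the chosen host strain.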

Most preferably, said auxotrophic or antibiotic markers are 
selected from LEU2, TRP1, URA3, HIS3, ADE2, LYS2 and Zeocin.

The planning of experiments may mean that the test for
interaction is not done immediately after the provision
of host cells and, possibly, the occurrence of the
interactions. In such cases, the researcher may wish to store
the transformed host cells for further use. Accordingly, a
further preferred embodiment of the invention relates to a 
method wherein progeny of host cells obtained in step (b) are 
transferred to a storage compartment. 

In particular in cases where a large number of clones is to be 
analyzed, said transfer is advantageously effected or assisted 
by automation or a picking robot. Naturally, other automation 
or robot systems that reliably pick progeny of said host cells 
into predetermined arrays in the storage compartments may also 
be employed. 

The host cells will, in this embodiment, be propagated in said 
storage compartment and provide further progeny for the
additional tests. Preferably, replicas of said storage
compartment maintaining the array of clones are set up. Said 
storage compartments comprising the transformed host cells and 
the appropriate media may be maintained in accordance with 
conventional cultivation protocols. Alternatively, said 
storage compartments may comprise an anti-freeze agent and 
therefore be appropriate for storage in a deep-freezer. This 
embodiment is particularly useful when the evaluation of 
potential interacting partners is to be postponed. As is well 
known in the art, frozen host cells may easily be recovered 
upon thawing and further tested in accordance with the 
invention. Most preferably, said anti-freeze agent is glycerol,
which is preferably present in said media in an amount of 3 to
25% (vol/vol).
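
As a worked example of the stated range, the volume of glycerol
stock needed for a given final volume follows from simple
proportion. The Python sketch below is illustrative only; the
function name glycerol_volume and the default of an undiluted
(100%) stock are assumptions, not values taken from the
description.

```python
def glycerol_volume(final_volume, target_percent, stock_percent=100.0):
    """Volume of glycerol stock giving target_percent (vol/vol) glycerol.

    final_volume:   total volume of medium plus glycerol (any unit).
    target_percent: desired final concentration; must lie in the
                    3 to 25% (vol/vol) range given in the text.
    stock_percent:  concentration of the glycerol stock (assumed 100%).
    The result is in the same unit as final_volume.
    """
    if not 3.0 <= target_percent <= 25.0:
        raise ValueError("target outside the 3 to 25% (vol/vol) range")
    if stock_percent < target_percent:
        raise ValueError("stock is more dilute than the target")
    return final_volume * target_percent / stock_percent
```

For instance, a final volume of 100 units at 15% (vol/vol)
requires 15 units of undiluted glycerol, or 30 units of a 50%
stock.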

In a further particularly preferred embodiment of the method 
of the invention, said storage compartment is a microtiter 
plate. Most preferably, said microtiter plate comprises 384 
wells. Microtiter plates have the particular advantage of
providing a pre-fixed array that allows the easy replication
of clones and furthermore the unambiguous identification and
assignment of clones throughout the various steps of the
experiment. The 384-well microtiter plate is, due to its
comparatively small size and large number of compartments, 
particularly suitable for experiments where large numbers of 
clones need to be screened. 
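
The unambiguous assignment of clones to positions in such a
pre-fixed array can be sketched in Python as follows. A
384-well plate has 16 rows (A to P) and 24 columns; the
row-by-row filling order, the numbering of plates from 1 and
the function name well_for_clone are assumptions made for this
illustration.

```python
ROWS = 16      # rows A to P of a 384-well plate
COLUMNS = 24   # columns 1 to 24


def well_for_clone(index):
    """Map a zero-based clone index to (plate number, well label).

    Plates are filled row by row (assumed order); plate numbers and
    column numbers start at 1.
    """
    if index < 0:
        raise ValueError("clone index must not be negative")
    plate, offset = divmod(index, ROWS * COLUMNS)
    row, column = divmod(offset, COLUMNS)
    label = f"{chr(ord('A') + row)}{column + 1}"
    return plate + 1, label
```

With this convention clone 0 is placed in well A1 of plate 1,
clone 383 in well P24 of plate 1, and clone 384 in well A1 of
plate 2.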

Depending on the design of the experiment, the host cells may 
be grown in the storage compartment such as the above 
microtiter plate to logarithmic or stationary phase. Growth 
conditions may be established by the person skilled in the art 
according to conventional procedures. Cell growth is usually 
performed between 15 and 45 degrees Celsius. 

Transfer of said host cells in step (c) is made or assisted by
automation, by using a spotting robot or by using a pipetting
or micropipetting device. How such a spotting robot may be
devised and equipped is, for example, described in Lehrach et
al. (1997). Naturally, other automation or robotic systems
that reliably create ordered arrays of clones may also be
employed.

WO 99/28744 PCT/EP98/07656

Most preferably, said transfer is made to a planar carrier 
which is subsequently placed on the at least three selective 
media as specified in steps (ca) and (cb). Alternatively, said
transfer of said host cells may be made to the planar carrier 
already placed on the selective media or said transfer may be 
made directly to the selective media. 

Most advantageously, said transfer is effected in a regular 
grid pattern at densities of 1 to 1000 clones per square 
centimeter. The progeny of said host cells may be transferred 
to a variety of planar carriers. Most preferred is a membrane
which may, for example, be manufactured from nylon,
nitrocellulose or PVDF.
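
The stated density range can be related to the spacing of
spots on the carrier. Assuming a square grid (an assumption;
the description requires only a regular grid pattern), a
density of d clones per square centimeter corresponds to a
centre-to-centre spacing of 10/sqrt(d) millimeters. The
function name spot_pitch_mm in the following Python sketch is
hypothetical.

```python
import math


def spot_pitch_mm(density_per_cm2):
    """Centre-to-centre spot spacing in mm for a square grid.

    density_per_cm2: clones per square centimeter; must lie in the
    range of 1 to 1000 given in the text.
    """
    if not 1 <= density_per_cm2 <= 1000:
        raise ValueError("density outside the range 1 to 1000 per cm2")
    # One spot occupies 1/density cm2; the side of that square is the pitch.
    return 10.0 / math.sqrt(density_per_cm2)
```

On this assumption the lower limit of 1 clone per square
centimeter corresponds to a spacing of 10 mm and a density of
100 clones per square centimeter to a spacing of 1 mm.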

The selective media used for growth of appropriate clones may 
be in liquid or in solid form. Preferably, said selective 
media when used in conjunction with a spotting robot and 
membranes as planar carriers are solidified with agar on which 
said spotted membranes are subsequently placed. Alternatively, 
and also preferably, said selective media when in liquid form 
are held within microtiter plates and said transfer is made by 
replication. 

Referring now to the step (d) of the method of the invention, 
the readout system can be analyzed by a variety of means. For 
example, it can be analyzed by visual inspection, radioactive, 
chemiluminescent, fluorescent, photometric, spectrometric,
infrared, colourimetric or resonant detection.

Preferably, said identification of host cells that express 
interacting fusion proteins is effected by visual means from 
consideration of the activation state of said readout system
of clones grown on the at least three selective media as
specified in steps (ca) and (cb).

Also preferably, said identification of host cells that 
express interacting fusion proteins in step (d) is effected or 
assisted by digital image storage, analysis or processing. In 
this embodiment, positive clones which are preferably arrayed 
on a planar carrier such as a membrane are identified by 
comparison of digital images obtained from the membrane after 
activation of said readout system on said selective media 
specified in (ca) and (cb).

Most preferably, the identities of positive host cells and
false positive host cells are stored on a computer, for example
within a relational database.

Identification of the at least one member of the pair or 
complex of interacting molecules may be effected by a variety 
of means. For example, molecules can be characterized by 
nucleic acid hybridization, oligonucleotide hybridization, 
nucleic acid or protein sequencing, restriction digestion, 
spectrometry or antibody reaction. Once the first member of an 
interaction has been identified, the second member or further 
members can also be identified by any of the above methods. 
Preferably the identification of at least one member of an 
interaction is effected by nucleic acid hybridization, 
antibody binding or nucleic acid sequencing. 

If nucleic acid hybridization is to be carried out, the
nucleic acid molecules comprised in the host cell and encoding
at least one of the interacting molecules are preferably
affixed to a planar carrier. As is well known in the art, said
planar carrier to which said nucleic acid may be affixed can
be, for example, a nylon, nitrocellulose or PVDF membrane, or a
glass or silica substrate (DeRisi et al. 1996; Lockhart et
al. 1996). Said host cells containing said nucleic acid may be
transferred to said planar carrier and subsequently lysed on
the carrier, and the nucleic acid released by said lysis is
affixed to the same position by appropriate treatment.
Alternatively, progeny of the host cells may be lysed in a 
storage compartment and the crude or purified nucleic acid 
obtained is then transferred and subsequently affixed to said 
planar carrier. Advantageously, said nucleic acids are 
amplified by PCR prior to transfer to the planar carrier. Most 
preferably said nucleic acid is affixed in a regular grid 
pattern in parallel with additional nucleic acids representing 
different genetic elements encoding interacting molecules. As 
is well known in the art, such regular grid patterns may be at 
densities of between 1 and 50 000 elements per square 
centimeter and can be made by a variety of methods. 
Preferably, said regular patterns are constructed using
automation or a spotting robot such as described in Lehrach et
al. (1997) and Maier et al. (1997) and furnished with defined
spotting patterns, barcode reading and data recording
abilities. Thus it is possible to correctly and unambiguously
return to stored host cells containing said nucleic acid from
a given spotted position on the planar carrier. Also
preferably, said regular grid patterns may be made by
pipetting systems, or by microarraying technologies as
described by Shalon et al. (1996), Schober et al. (1993) or
Lockhart et al. (1996). Identification is, again,
advantageously effected by nucleic acid hybridization.
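
The return from a spotted position to the stored host cell may, for example, be sketched as a lookup table recorded at the time of spotting. The following Python sketch is illustrative only: it assumes a one-to-one transfer of a single 384-well plate (16 rows by 24 columns) onto the carrier, and all names, including the plate barcode, are hypothetical.

```python
from typing import Dict, NamedTuple, Tuple

class Source(NamedTuple):
    plate_barcode: str  # barcode of the stored microtiter plate
    well: str           # well label, "A1" to "P24" on a 384-well plate

def well_label(row: int, col: int) -> str:
    """Well label from a zero-based row (0-15) and column (0-23)."""
    if not (0 <= row < 16 and 0 <= col < 24):
        raise ValueError("position lies outside a 384-well plate")
    return f"{chr(ord('A') + row)}{col + 1}"

def record_spotting(plate_barcode: str,
                    offset: Tuple[int, int] = (0, 0)) -> Dict[Tuple[int, int], Source]:
    """Map every grid position on the carrier to the well it was spotted from.

    Assumed layout: the plate is copied one-to-one onto the carrier,
    shifted by `offset` (rows, columns).
    """
    r0, c0 = offset
    return {
        (r0 + r, c0 + c): Source(plate_barcode, well_label(r, c))
        for r in range(16)
        for c in range(24)
    }

grid = record_spotting("PLATE-0001")
# A hybridization signal at grid position (2, 5) leads back to this well:
print(grid[(2, 5)])
```

A real spotting pattern would typically interleave several plates on one carrier; only the recorded mapping changes in that case, not the lookup.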

Using a detectable nucleic acid probe of interest, homologous 
nucleic acids which are affixed on the planar carrier can be 
identified by hybridization. From the spotted position of said 
homologous identified nucleic acid on the planar carrier, the 
corresponding host cell in the storage compartment can be 
identified which contains both or all members of the 
interaction. The second member of the interaction, for example,
can now be identified by any of the above methods. For
example, by use of a radioactively labeled Ras probe,
homologous nucleic acids on the planar carrier can be
identified by hybridization. The Ras interacting proteins can
now be identified from the corresponding host cell that
contains both the first genetic element homologous to the Ras
probe and the second genetic element encoding these Ras
interacting proteins.

If multiple oligonucleotide hybridizations are carried out on
the nucleic acids affixed to the planar carrier,
oligofingerprints of all genetic elements encoding the
interacting proteins can be obtained. These oligofingerprints
can be used to identify all members of the interactions or
those members that belong to specific gene families, as
described in Maier et al. (1997).

Advantageously, the nucleic acid molecules that encode the
interacting proteins are, prior to identification such as by
DNA sequencing, amplified by PCR or as part of said genetic
elements in host cells, preferably in E. coli. In the latter
case, amplification of said genetic elements is conducted by
multiplication of the E. coli cells and isolation of said
genetic elements. Methods of identifying the nucleic acids that
encode interacting proteins by DNA sequencing and analysis are
well known in the art. By amplifying and sequencing the nucleic
acids that encode both or all members of an interaction from
the same clone, the identity of both or all members of the
interaction can be determined.

If a specific antibody is to be used to determine whether a 
protein of interest is expressed as a fusion protein within an 
interaction library, it is advantageous to affix all fusion
proteins expressed from the interaction library onto a planar
carrier. For example, clones of the interaction library that
express fusion proteins can be transferred to a planar carrier
using a spotting robot as described in Lehrach et al. (1997).
The clones are subsequently lysed on the carrier and released 
proteins are affixed onto the same position. Using, for 
example, an anti-HIP1 antibody (Wanker et al. 1997), clones
from the interaction library that contain HIP1 fusion proteins
and an unknown interacting fusion protein can be identified.
The unknown member of the interacting pair of molecules can 
now be identified from the corresponding host cell by any of 
the above methods. The antibodies used as probes may be 
directly detectably labeled. Alternatively, said antibodies 
may be detected by a secondary probe or antibody which may be 
specific for the primary antibody. Various alternative 
embodiments using, for example, tertiary antibodies may be 
devised by the person skilled in the art on the basis of his 
common knowledge.

Most advantageously, when said identification of members 
comprising an interaction is effected using said regular 
grids, a digital image of the planar carrier after 
hybridization or antibody reaction is obtained and analysis is 
effected by digital image storage, processing or analysis 
using an automated or semi-automated image analysis system,
such as described in Lehrach et al. (1997). 

Most preferably, the information comprising the identity of
the host cell and the identity of the interacting molecules
expressed by the genetic elements contained within the host
cell is stored on a computer, for example within a relational
database.



In accordance with the present invention, it is additionally
preferred that, prior to step (a), a preselection against
clones that express a single molecule able to activate the
readout system is carried out on culture media comprising a
counterselective compound, for example 5-fluoroorotic acid,
canavanine, cycloheximide or α-aminoadipate.

In this embodiment, for example, the URA3 gene is incorporated
as a component of the readout system. Clones containing only
one of said genetic elements are placed on a selective medium
comprising 5-fluoroorotic acid (5-FOA). In clones that express
a single molecule able to activate the readout system, 5-FOA
is converted into the toxic 5-fluorouracil. Accordingly, host
cells containing auto-activating molecules will die on the
selective medium containing 5-FOA.

It is further important to note that the marker used for said 
preselection cannot be used as a selectable or 
counterselectable marker at the same time. 

The present invention also relates to a method for the
production of a pharmaceutical composition comprising
formulating said at least one member of the interacting
molecules identified by the method of the invention in a
pharmaceutically acceptable form.

Said pharmaceutical composition comprises at least one of the 
aforementioned compounds identified by the method of the 
invention, either alone or in combination, and optionally a 
pharmaceutically acceptable carrier or excipient. Examples of
suitable pharmaceutical carriers are well known in the art and 
include phosphate buffered saline solutions, water, emulsions, 
such as oil/water emulsions, various types of wetting agents, 
sterile solutions etc. Compositions comprising such carriers 
can be formulated by conventional methods. These
pharmaceutical compositions can be administered to a subject
in need thereof at a suitable dose. Administration of the
suitable compositions may be effected by different ways, e.g., 
by intravenous, intraperitoneal, subcutaneous, intramuscular, 
topical or intradermal administration. The dosage regimen will 
be determined by the attending physician and other clinical 
factors. As is well known in the medical arts, dosages for any
one patient depend upon many factors, including the patient's
size, body surface area, age, the particular compound to be 
administered, sex, time and route of administration, general 
health, and other drugs being administered concurrently. 
Dosages will vary but a preferred dosage for intravenous
administration of DNA is from approximately 10^6 to 10^22 copies
of the nucleic acid molecule. Proteins or peptides may be
administered in the range of 0.1 ng to 10 mg per kg of body
weight. The compositions of the invention may be administered
locally or systemically. Administration will generally be
parenteral, e.g., intravenous; DNA may also be
administered directly to the target site, e.g., by biolistic 
delivery to an internal or external target site or by catheter 
to a site in an artery. 

The present invention further relates to a method for the 
production of a pharmaceutical composition comprising 
formulating an inhibitor of the interaction of the interacting 
molecules identified by the method of the invention in a 
pharmaceutically acceptable form. 

The inhibitor may be identified according to conventional 
protocols. Additionally, molecules that inhibit existing 
protein-protein interactions can be isolated with the yeast 
two-hybrid system using the URA3 readout system. Yeast cells 
that express interacting GAL4ad and LexA fusion proteins which 
activate the URA3 readout system are unable to grow on 
selective medium containing 5-FOA. However, when an additional
molecule is present in these cells which disrupts the
interaction of the fusion proteins, the URA3 readout system is
not activated and the yeast cells can grow on selective medium
containing 5-FOA. Using this method, potential inhibitors of a
protein-protein interaction can be isolated from a library
comprising these inhibitors. Systems corresponding to the URA3
system may be devised by the person skilled in the art on the 
basis of the teachings of the present invention and are also 
comprised thereby. 
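
The selection logic of the URA3 readout system described above may be summarized, purely for illustration, by the following Python sketch. The function names are hypothetical, and the sketch assumes an idealized all-or-nothing response of the readout system.

```python
def ura3_active(fusions_interact: bool, inhibitor_present: bool) -> bool:
    """URA3 is expressed only while the two fusion proteins interact.

    Assumption: an inhibitor, when present, fully disrupts the interaction.
    """
    return fusions_interact and not inhibitor_present

def grows_on_5foa(fusions_interact: bool, inhibitor_present: bool) -> bool:
    """Cells expressing URA3 convert 5-FOA into a toxic product and do not grow."""
    return not ura3_active(fusions_interact, inhibitor_present)

# Cells expressing interacting fusion proteins, without and with an inhibitor
for inhibitor in (False, True):
    print(f"inhibitor={inhibitor}: growth on 5-FOA = {grows_on_5foa(True, inhibitor)}")
```

In this idealized model, growth on 5-FOA of a strain known to express interacting fusion proteins indicates the presence of a molecule that disrupts the interaction.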

Also, the present invention relates to a method for the 
production of a pharmaceutical composition comprising 
identifying a further molecule in a cascade of interacting
molecules, of which the at least one member of interacting
molecules identified by any of the above methods is a part,
or identifying an inhibitor of said further molecule.

Once at least one member of the interacting molecules has been 
identified, it is reasonable to expect that said member is a 
part of a biological cascade. Identification of additional 
members of said cascade can be effected either by applying the 
method of the present invention or by applying conventional 
methods. Also, inhibitors of said further members can be 
identified and can be formulated into pharmaceutical
compositions.

The present invention relates further to a kit comprising at 
least one of the following: 

(f) host cells as identified in any of the preceding claims 
and at least one genetic element comprising said genetic 
information specifying at least one of said possibly 
interacting molecules containing a counterselectable 
marker and specified herein above; 

(g) host cells as identified in any of the preceding claims 
and at least one genetic element not comprising genetic 
information specifying at least one of said potentially 
interacting molecules containing a counterselectable 
marker and specified herein above; 

(h) at least one genetic element comprising said genetic 
information specifying at least one of said potentially 
interacting molecules containing a counterselectable 
marker and specified herein above; 

(i) at least one genetic element not comprising genetic 
information specifying at least one of said potentially 
interacting molecules containing a counterselectable 
marker and specified herein above;

(j) host cells comprising at least one and preferably at
least two of said genetic elements specified in (h) or
(i);

(k) at least one planar carrier carrying nucleic acid or 
protein from said host cells comprising at least one 
member of said genetic elements specified herein above 
wherein said nucleic acid or protein is affixed to said 
carrier in grid form and optionally solutions to effect 
hybridization or binding of nucleic acid probes or 
proteins to said molecules affixed to said grid; 

(l) at least one storage compartment, planar carrier or
computer disc comprising and/or characterizing genetic
elements, host cells, storage compartments or carriers
identified in any of (f) to (k); and/or

(m) at least one yeast strain comprising a can1 and a cyh2
mutation.

Preferably, said kit comprises or also comprises at least one
storage compartment containing the host cells of (f), (g) or
(j) and/or comprises or also comprises at least one storage
compartment containing said genetic information or said
potentially interacting molecules encoded by said genetic
information as specified in (f) or (h).

The present invention also relates to the use of any of the 
yeast strains described herein above and in the appended 
examples for the identification of at least one member of a 
pair of potentially interacting molecules. 

Advantageously, those molecules identified by the method of
the present invention as interacting with many different
molecules can be recorded. This information can reduce the
work needed to further characterise particular interactions,
since those interactions comprising a molecule found to
interact with many other molecules within a 2H system may be
suspected of being artifactual (Bartel et al., 1993).

Preferably, the data obtained by using the method of the
present invention can be accessed through the use of software
tools or graphical interfaces that make it easy to query the
established interaction network with a biological question or
to develop the established network by the addition of further
data.

Accordingly, the present invention further relates to a 
computer implemented method for storing and analysing data 
relating to potential members of at least one pair or complex 
of interacting molecules encoded by nucleic acids originating 
from biological samples, said method comprising:

(n) retrieving from a first data-table information for a
first nucleic acid, wherein said information comprises:

(oa) a first combination of letters and/or numbers uniquely 
identifying the nucleic acid, and 

(ob) the type of genetic element comprising said nucleic acid 
and 

(oc) a second combination of letters and/or numbers uniquely 
identifying a clone in which a potential member encoded 
by said nucleic acid was tested for interaction with at 
least one other potential member of a pair or complex of 
interacting molecules 

(p) using said second combination of letters and/or numbers
to retrieve from said first data-table or optionally
further data-tables, information identifying additional
nucleic acids encoding said at least one other
potential member in step (oc).
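
The retrieval steps set out above can be sketched, by way of example only, with a single relational table. The following Python sketch uses the standard sqlite3 module with an in-memory database; the table name, column names and identifiers are hypothetical and serve only to illustrate the two look-ups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE nucleic_acid (
           nucleic_acid_id TEXT PRIMARY KEY,  -- first unique combination
           element_type    TEXT NOT NULL,     -- type of genetic element
           clone_id        TEXT NOT NULL      -- second unique combination
       )"""
)
conn.executemany(
    "INSERT INTO nucleic_acid VALUES (?, ?, ?)",
    [
        ("NA-0001", "binding-domain plasmid", "CL-0042"),
        ("NA-0002", "activation-domain plasmid", "CL-0042"),
        ("NA-0003", "binding-domain plasmid", "CL-0043"),
    ],
)

def partners(conn, nucleic_acid_id):
    """Look up the clone of one nucleic acid, then return the identifiers of
    the other nucleic acids that were tested in the same clone."""
    row = conn.execute(
        "SELECT clone_id FROM nucleic_acid WHERE nucleic_acid_id = ?",
        (nucleic_acid_id,),
    ).fetchone()
    if row is None:
        return []
    return [
        r[0]
        for r in conn.execute(
            "SELECT nucleic_acid_id FROM nucleic_acid "
            "WHERE clone_id = ? AND nucleic_acid_id <> ? "
            "ORDER BY nucleic_acid_id",
            (row[0], nucleic_acid_id),
        )
    ]

print(partners(conn, "NA-0001"))
```

In this sketch the clone identifier acts as the shared key: nucleic acids recorded under the same clone identifier were tested together for interaction.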

A preferred embodiment of said method further comprises using
said second combination of letters and/or numbers in step f3)
to retrieve from a second data-table further information,
where said further information at least comprises the
interaction class of said clone, and optionally additional
information comprising,

(q) the physical location of the clone; and

(r) predetermined experimental details pertaining to creation
of said clone, including at least one of:
(ra) tissue, disease-state or cell source of the nucleic acid;
(rb) cloning details; and
(rc) membership of a library of other clones.

It is additionally preferred, that said method comprises using 
said information of step (o) on said first and/or of step (p) 
on additional nucleic acids to relate to a third data-table
further characterising said first and/or additional nucleic 
acids, where said further characterising comprises at least 
one of 

(s) hybridization data, 

(t) oligonucleotide fingerprint data, 

(u) nucleotide sequence, 

(v) in-frame translation of the said nucleic acids, and

(w) tissue, disease-state or cell source gene expression
data; and

optionally identifying the protein domain encoded by said 
first or additional nucleic acids. 

Preferably also said method comprises identifying whether said 
potential members encoded by the nucleic acids interact, by 
considering said interaction class of said clone in which 
nucleic acids were tested for said interaction in step f3) . 

More preferably, said data relates to one or more of 10 to 100
potential members, yet more preferably 100 to 1000 potential
members, yet more preferably 1000 to 10,000 potential members
and most preferably more than 10,000 potential members.

In a preferred embodiment, said data was generated by the
aforementioned method for identifying members of a pair or
complex of interacting molecules.

In a further preferred embodiment, said interaction class 
comprises one of the following: Positive, or Negative, or 
False Positive. 

It is further preferred that sticky proteins are identified
by considering the number of occurrences in which a given
member is found to interact with different members in
different clones of said Positive interaction class.
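
As a non-limiting illustration, such a count may be obtained as in the following Python sketch. The member names, the pair representation and the threshold value are assumptions made for the example; no particular threshold is prescribed herein.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

def partner_counts(positive_pairs: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Number of distinct interaction partners per member, taken over the
    clones of the Positive interaction class."""
    partners: Dict[str, Set[str]] = defaultdict(set)
    for a, b in positive_pairs:
        partners[a].add(b)
        partners[b].add(a)
    return {member: len(found) for member, found in partners.items()}

def sticky_members(positive_pairs: Iterable[Tuple[str, str]],
                   threshold: int) -> List[str]:
    """Members with at least `threshold` distinct partners.
    The threshold is an arbitrary, user-chosen value in this sketch."""
    counts = partner_counts(positive_pairs)
    return sorted(m for m, n in counts.items() if n >= threshold)

# One (member, member) pair per positive clone; a repeated pair counts once
pairs = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P1", "P2")]
print(sticky_members(pairs, threshold=3))
```

Counting distinct partners rather than clones avoids flagging a member merely because the same interaction was isolated several times.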

More preferably, said first data-table forms part of a first 
database, and said second and third data tables form part of 
at least a second database. 

Yet more preferably, said second database is held on a 
computer readable memory separate from the computer readable 
memory holding said first database, and said database is 
accessed via a data exchange network. 

It is further preferred, that said second database comprises 
nucleic acid or protein sequence, secondary or tertiary 
structure, biochemical, biographical or gene expression 
information. 

In a particularly preferred embodiment, data entry to said 
first, second or further data tables is controlled 
automatically from said first database by access to other
computer data, programs or computer controlled robots. 

It is yet more preferred, that at least one workflow 
management system is built around particular sets of data to 
assist in the progress of the aforementioned method for 
identifying members of a pair or complex of interacting
molecules.

Most preferably, said workflow management system is software
to assist in the progress of the identification of members of
a pair or complex of interacting molecules using the
aforementioned method of hybridization of nucleic acids.

In another preferred embodiment, said data are investigated by 
queries of interest to an investigator. 

More preferably, said queries include at least one of:

(aa) identifying the interaction or interaction pathway
between a first and second member of an interaction
network,

(ab) identifying the interaction pathway between a first and 
second member of an interaction network and through at 
least one further member, 

(ac) identifying the interaction or interaction pathway
between at least two members characterised by nucleic
acid or protein sequences, secondary or tertiary
structures, and

(ad) identifying interactions or interaction pathways that are 
different for said different tissue, disease-state or 
cell source. 
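
A query of the kind listed under (aa) and (ab) may, for example, be answered by a breadth-first search over the stored interactions. The following Python sketch is illustrative only; the pair-based representation of the network and the member names are assumptions, and other graph search methods could equally be used.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Optional, Set, Tuple

def build_network(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Undirected interaction network from (member, member) pairs."""
    network: Dict[str, Set[str]] = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return network

def shortest_pathway(network: Dict[str, Set[str]],
                     start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search; returns one shortest chain of interactions
    from `start` to `goal`, or None if the two are not connected."""
    if start not in network or goal not in network:
        return None
    previous: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        member = queue.popleft()
        if member == goal:
            # Walk back through the recorded predecessors to rebuild the path
            path: List[str] = []
            current: Optional[str] = member
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbour in sorted(network[member]):
            if neighbour not in previous:
                previous[neighbour] = member
                queue.append(neighbour)
    return None

net = build_network([("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")])
print(shortest_pathway(net, "A", "D"))
```

A direct interaction appears as a two-member path, whereas a pathway through further members appears as a longer chain.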

Yet more preferably, parts of said information are stored in a 
controlled format to assist data query procedures. 

Even more preferred is a method, wherein the results of said 
queries are displayed to the investigator in a graphical 
manner. 

Yet more advantageous is the method, wherein a subset of data
comprising data characterising nucleic acids identified as
encoding members of a pair or complex of said interacting
molecules is stored in a further data-table or database.

Yet more preferably, consideration of the number of
occurrences a given member is identified to interact with a
second or further member is used to decide if said data
characterising nucleic acids form part of said subset of
data.

Even more preferred is the method, wherein additional 
information or experimental data is used to select those data 
to form part of said subset. 

Most preferably, to speed certain data query procedures, the 
structure in which the data is stored in the computer readable 
memory is modified. 

In another preferred embodiment, the data is held in
relational or object-oriented databases.

The invention further relates to a data storage scheme 
comprising a data table that holds information on each member 
of an interaction, where a record in said table represents 
each member of an interaction, and in which members are 
indicated to form interactions by sharing a common name. 

Preferably, in said data storage scheme said common name is a 
clone name or unique combination of letters and/or numbers 
comprising said clone name. 

A computer-implemented method for handling the data gathered
provides a robust and efficient solution for handling the
large amount of protein-protein interaction data produced by
the method of the invention. It provides the ability to
communicate with and utilise different data-bases and/or other
data storage systems across intranets or the internet, and
interfaces to allow querying of the data-base by an
investigator and visual display of the results of the query.
Relational or object-orientated data-bases, with data-parsing
and display programs supporting said data-base, secure ease of
use. By way of example, Figure 2 displays a scheme and features
for a set of data-tables suitable for managing such
interaction data. The
primary links between table-keys are indicated, as are the 
entry fields or elements to be held within each table. If 
desired, elements of a table may be expanded into an 
additional table holding further data. Likewise, certain 
tables may be expanded into an additional data-base to hold 
and manage further data. Said additional data-base may be
stored on the same or on remote computers. Elements of the
table can be recorded in numerical, descriptive or fixed 
format, whatever is most appropriate for the respective data. 
To provide efficient querying, where appropriate, elements are 
recorded in controlled vocabulary. Figure 3 displays in what
part of the work process during an interaction experiment each
table is most relevant and where it forms the underlying
data-set on which work-flow management software for that part
of the process is based.

Other computer-based methods of generating visual 
representations of specific interactions, partial or complete 
protein-protein interaction networks can be employed to 
automatically calculate and display the required interactions 
most efficiently. As is well known in the art, computer
data-bases are a valuable resource for large-scale biological
and molecular biological research.

In summary, a significant advantage of the method of the
invention over existing yeast 2H systems is the scale at which
such identification of interactions and interaction members can
be made. Preferably, the method of the invention screens
library vs. library interactions using arrayed interaction
libraries. Thus, the method of the invention allows, in an
efficient manner, a more complete and exhaustive generation of
protein-protein interaction networks than existing methods. An
established and exhaustive network of protein-protein
interactions is of use for many purposes as shown in Figure 1.
For example, it may be used to predict the existence of new
biological interactions or pathways, or to determine links
between biological networks. Furthermore with this method, the
function and localisation of previously unknown proteins can 
be predicted by determining their interaction partners. It 
also can be used to predict the response of a cell to changes 
in the expression of particular members of the networks. 
Finally, these data can be used to identify proteins or 
interactions between proteins within a medically relevant 
pathway which are suitable for therapeutic intervention, 
diagnosis or the treatment of a disease. 

The figures show: 

Figure 1 

The applications of an established and exhaustive network of 
protein-protein interactions. The identity of positive clones 
and the identity of the members comprising the interactions 
for the entire interaction library are stored in a database. 
These data are used to establish a network of protein-protein
interactions which can be used for a variety of purposes, for
example to predict the existence of new biological
interactions or pathways, or to determine links between
biological networks. Furthermore with this method, the
function and localisation of previously unknown proteins can 
be predicted by determining their interaction partners. It 
also can be used to predict the response of a cell to changes 
in the expression of particular members of the networks. 
Finally, these data can be used to identify proteins within a
medically relevant pathway which are suitable for therapeutic
intervention, diagnosis or the treatment of disease.

Figure 2 

A scheme and features for a set of data-tables suitable for 
storing, managing and retrieving data from a large-scale 
protein-protein interaction screen. The scheme could be
implemented in either relational or object-orientated
data-bases. The primary links between table-keys are indicated,
as are the suggested fields or elements to be held within each
table.

Figure 3 

A process flow representing the experimental and informatic 
flow during a large-scale protein-protein interaction screen. 
The figure displays in which part of the experimental steps
each table from the data-base described above is most
applicable. Each table forms the underlying data-set on
which work-flow management software for that part of the
process is based.

Figure 4 

Plasmids constructed for the improved 2-hybrid system.

The plasmid maps of the pBTM118a, b and c DNA binding domain 
vector series and the pGAD428a, b and c activation domain 
vector series. Both plasmids contain the unique restriction 
enzyme sites for Sal I and Not I which can be used to clone a 
genetic fragment into the multiple cloning site. The plasmids 
are maintained in yeast cells by the selectable markers TRP1 
and LEU2 respectively. The loss of the plasmids can be 
selected for by the counterselective markers CAN1 and CYH2 
respectively. 

Polylinkers used within the multiple cloning site to provide
expression of the genetic fragment in one of the three reading
frames.

Figure 5 

The structure of the URA3 readout system carried by the
plasmid pLUA. Important features of pLUA include the URA3 gene
which is under the transcriptional control of the lexAop-GAL1
promoter, the ADE2 selectable marker that allows yeast
ade2-auxotrophs to grow on selective media lacking adenine and
the β-lactamase gene (bla) which confers ampicillin resistance
in E. coli. The pLUA plasmid replicates autonomously both in
yeast using the 2μ replication origin and in E. coli using the
ColE1 origin of replication.

Figure 6 

A schematic overview of one embodiment of the method of the 
invention. For the parallel analysis of a network of protein- 
protein interactions using the method of the invention, a 
library of plasmid constructs that express DNA binding domain 
and activation domain fusion proteins is provided. These 
libraries may consist of specific DNA fragments or a multitude 
of unknown DNA fragments ligated into the improved binding 
domain and activating domain plasmids of the invention which 
contain different selectable and counterselectable markers. 
Both libraries are combined within yeast cells by 
transformation or interaction mating, and yeast strains that 
express potentially interacting proteins are selected on 
selective medium lacking histidine. The selective markers TRP1
and LEU2 maintain the plasmids in the yeast strains grown on
selective media, whereas CAN1 and CYH2 specify the
counterselectable markers that select for the loss of each
plasmid. HIS3 and lacZ represent selectable markers in the
yeast genome, which are expressed upon activation by
interacting fusion proteins. The readout system is, in the
present case, both growth on medium lacking histidine and the
enzymatic activity of β-galactosidase which can be subsequently
screened. A colony picking robot is used to pick the resulting
yeast colonies into individual wells of 384-well microtiter
plates, and the resulting plates are incubated at 30 °C to
allow cell growth. The interaction library held in the 
microtiter plates optionally may be replicated and stored. The 
interaction library is investigated to detect positive clones 
that express interacting fusion proteins and discriminate them 
from false positive clones using the method of the invention. 
Using a spotting robot, cells are transferred to replica
membranes which are subsequently placed onto one of each of
the selective media SD-leu-trp-his, SD-leu+CAN and SD-trp+CHX.
After incubation on the selective plates, the clones which
have grown on the membranes are subjected to a β-Gal assay and
a digital image from each membrane is captured with a CCD
camera and then stored on computer. Using digital image
processing and analysis, clones that express interacting fusion
proteins can be identified by considering the pattern of β-Gal
activity of these clones grown on the various selective media.
The individual members comprising the interactions can then be
identified by one or more techniques, including PCR,
sequencing, hybridisation, oligofingerprinting or antibody
reactions.

Figure 7 

A schematic overview of one embodiment of the method of the 
invention. For the parallel analysis of a network of protein- 
protein interactions with the method of the invention, two 
libraries of plasmid constructs that express DNA binding 
domain or activation domain fusion proteins are provided. 
These libraries may consist of specific DNA fragments or a 
multitude of unknown DNA fragments ligated into binding domain 
and activating domain plasmids which contain the selectable 
markers TRP1 and LEU2, and optionally the counterselective
markers CAN1 and CYH2 respectively. The libraries are
transformed into either Mata or Matα yeast strains containing
the URA3 readout system and are subsequently plated onto
selective media containing 5-fluoroorotic acid (5-FOA). Only
those yeast cells that express fusion proteins unable to auto- 
activate the URA3 readout system will grow in the presence of 
5-FOA. The resulting yeast strains that express only non-auto- 
activating proteins can then be directly used in an automated 
interaction mating approach to generate ordered arrays of 
diploid strains which can be assayed for activation of the 
lacZ readout system. a) Individual yeast cells that express
single fusion proteins unable to activate the URA3 readout 
system are transferred into wells of a 384-well microtiter 
plate using a modified picking robot. The yeast strains held 
in the microtiter plates can optionally be replicated and 
stored. The microtiter plates contain a growth medium lacking
amino acids appropriate to maintain the corresponding plasmids
in the yeast strains. The interaction matings are subsequently
performed by automatically transferring a Mata and a Matα
yeast strain to the same position on a Nylon membrane using
automated systems as described by Lehrach et al. (1997).
Alternatively, a pipetting or micropipetting system (Schober 
et al. 1993) can be used to transfer small volumes of 
individual liquid cultures of a yeast strain onto which a lawn 
of yeast cells derived from at least one yeast clone of the 
opposite mating type is sprayed or applied. Yeast strains may 
be applied singly or as pools of many clones. In both methods,
the ordered arrays of yeast clones are incubated overnight at
30 °C to allow interaction mating to occur. The resulting diploid
cells are then analysed in a β-Gal assay as described by
Breeden & Nasmyth (1985). b) Yeast strains that grew on
selective media containing 5-FOA are pooled and interaction
mating between the Mata and Matα strains is carried out in liquid
YPD medium. Those diploid yeast strains that express
interacting proteins are selected by plating on selective 
medium lacking histidine and uracil. The selective markers 
TRP1 and LEU2 maintain the plasmids in yeast strains grown on
selective media. HIS3, URA3 and lacZ represent reporter genes
in the yeast cells, which are expressed on activation by
interacting fusion proteins. The readout system is, in the
present case, growth on medium lacking histidine and/or uracil
and enzymatic activity of β-galactosidase, which can be
screened at a later time point. A modified colony picking 
robot is used to pick the diploid yeast colonies into 
individual wells of 384-well microtiter plates containing 
selective medium, and the resulting plates are incubated at 
30°C to allow cell growth. The interaction library optionally 
may be replicated and stored. Using a spotting robot, diploid 
cells are transferred to replica membranes which are 
subsequently placed onto growth medium. Replica membranes are 
placed on the counterselective media SD-trp+CHX or SD-leu+CAN. 
The resulting regular arrays of diploid yeast clones are 
analysed for β-Gal activity as described by Breeden & Nasmyth
(1985). In either case, a) or b), a digital image of each
dried membrane is captured with a CCD camera and
stored on computer. Using digital image processing and
analysis, clones that express interacting fusion proteins can
be identified by considering the β-Gal activity of these
clones, spotted in a defined pattern and grown on the membranes
placed on the various selective media. The individual members
comprising the interactions can then be identified by one or
more techniques, including PCR, sequencing, hybridisation,
oligofingerprinting or antibody reactions.

Figure 8 

Predicted interactions between fusion proteins used to create 
the defined interaction library. The fusion proteins enclosed 
with dark rounded boxes are believed to interact as shown. The 
LexA-HIP1 and GAL4ad-LexA fusion proteins enclosed by thin
rectangular boxes have been shown to activate the LacZ readout
system without the need for any interacting fusion protein.
The two proteins LexA and GAL4ad, and the three fusion
proteins GAL4ad-HIPCT, GAL4ad-14-3-3 and LexA-MJD (all
unboxed) are believed not to interact with each other or other 
fusion proteins used in this example. 

Figure 9 

Identification of positive clones that contained interacting 
fusion proteins from false positive clones using the method of 
the invention. Three different yeast clones each containing 
pairs of plasmid constructs (positive control: pBTM117c-SIM1 &
pGAD427-ARNT; negative control: pBTM117c & pGAD427; and
false-positive control: pBTM117c-HIP1 & pGAD427) were transferred
by hand to four agar plates each containing a different selective
medium (SD-leu-trp, SD-leu-trp-his, SD-leu+CAN and
SD-trp+CHX), and incubated for 48 hours at 30 °C. The yeast
colonies were subsequently transferred to a Nylon membrane and
assayed for β-gal activity by the method of Breeden and
Nasmyth (1985).

Figure 10

Digital images of the β-gal assays made from the replica Nylon
membranes containing the defined interaction library obtained
from the selective media (a) SD-leu-trp-his, (b) SD-trp+CHX
and (c) SD-leu+CAN. In each case, the left hand side of each
membrane contains control clones and clones from the defined 
interaction library, and the right hand side contains only 
clones from the defined interaction library. The two regions 
marked on the first membrane represent those clones magnified 
in Figure 11. Each membrane measures 22 x 8 cm overall
and contains 6912 spot locations at a spotting pitch of 1.4
mm. 

Figure 11 

Magnification of clones from the interaction library taken 
from the same region of three membranes obtained from the 
selective media SD-leu-trp-his, SD-trp+CHX and SD-leu+CAN 
assayed for β-gal activity:

Clones imaged from a region of the right hand side of the 
membrane containing the defined interaction library. Clones 
from the defined interaction library that express interacting 
proteins are ringed and correspond to the microtiter plate 
addresses 06L22 and 08N24.

Clones imaged from a region of the left hand side of the same 
membranes containing control clones and clones from the 
interaction library, where clones around each ink guide-spot
are arranged as shown and correspond to: 00 Ink guide-spot; 01
False positive control clone that expresses the fusion protein
GAL4ad-LexA; 02 False positive clone expressing the fusion
protein LexA-HIP1; 03 Positive control clone expressing the
interacting fusion proteins LexA-SIM1 & GAL4ad-ARNT; 04 Clone
from the defined interaction library. The positive control
clone (spot position 03) is ringed.

Figure 12

A subset of the list of clones identified by computer query of 
data produced by automated image analysis and quantification 
of the β-galactosidase activity. Each record represents the
β-galactosidase activity for a given clone grown on three
selective media. This program queried the data to identify all
clones from the interaction library that had activated the
reporter gene (score > 0) when grown on minimal medium
lacking leucine, tryptophan and histidine (SD-leu-trp-his),
yet had not done so on either of the counterselective media
(score on both media equal to 0).

Two positive clones 06L22 and 08N24 characterised by 
hybridisation are present within the computer file. 
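
By way of illustration only, the query described for Figure 12 can be sketched as follows. The record layout, the numerical scores and the clone addresses other than 06L22 and 08N24 are hypothetical, as is the assumption that a clone address is composed of a plate number, a row letter (A to P) and a column number (1 to 24) of a 384-well microtiter plate. The sketch is not the program actually used.

```python
import re
from typing import NamedTuple


class CloneScore(NamedTuple):
    """β-galactosidase scores of one clone on the three selective media."""
    address: str   # microtiter plate address, e.g. "06L22"
    his: int       # score on SD-leu-trp-his
    trp_chx: int   # score on SD-trp+CHX (counterselective)
    leu_can: int   # score on SD-leu+CAN (counterselective)


def is_positive(rec: CloneScore) -> bool:
    """Reporter active on SD-leu-trp-his, inactive on both counterselective media."""
    return rec.his > 0 and rec.trp_chx == 0 and rec.leu_can == 0


def parse_address(address: str) -> tuple:
    """Split an address into (plate, row, column); the format is an assumption."""
    m = re.fullmatch(r"(\d+)([A-P])(\d{1,2})", address)
    if m is None or not 1 <= int(m.group(3)) <= 24:
        raise ValueError(f"not a 384-well address: {address!r}")
    return int(m.group(1)), m.group(2), int(m.group(3))


# Hypothetical scores; only the addresses 06L22 and 08N24 come from the text.
records = [
    CloneScore("06L22", 3, 0, 0),
    CloneScore("08N24", 2, 0, 0),
    CloneScore("01A01", 3, 3, 0),  # still active after loss of the GAL4ad plasmid
    CloneScore("01A02", 3, 0, 2),  # still active after loss of the LexA plasmid
    CloneScore("01A03", 0, 0, 0),  # reporter never activated
]

positives = [r.address for r in records if is_positive(r)]
for address in positives:
    print(address, parse_address(address))
```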

Figure 13 

Characterisation by hybridisation of the genetic fragments 
carried by the clones 06L22 and 08N24. A 1.3 kb SIM1 and a
1.4 kb ARNT DNA fragment were used as nucleic acid probes for 
hybridisation to high-density spotted membranes containing DNA 
from the defined interaction library. These clones were 
characterised as containing SIM1 and ARNT genetic fragments by 
hybridisation. The images are of the same region of the 
membranes as those shown in Figure 11 a. The spot positions of 
the clones 06L22 and 08N24 are ringed. 

Figure 14 

Identification of the SIM1 and ARNT DNA fragments from the 
yeast two hybrid plasmid carried by the clone 06L22 by duplex 
PCR. Plasmid DNA was isolated from a liquid culture of the 
clone 06L22 by a QiaPrep (Hilden) procedure and the inserts 
contained within the plasmids were amplified by PCR using the 
primer pairs 5'-TCG TAG ATC TTC GTC AGC AG-3' & 5'-GGA ATT
AGC TTG GCT GCA GC-3' for the plasmid pBTM117c and 5'-CGA TGA
TGA AGA TAC CCC AC-3' & 5'-GCA CAG TTG AAG TGA ACT TGC-3' for
pGAD427. Lane 1 contains a Lambda DNA digestion with BstEII as
size marker; Lane 2 contains the duplex PCR reaction from
plasmids isolated from clone 06L22; Lanes 3 and 4 contain
control PCR amplifications from the plasmids pBTM117c-SIM1 and
pGAD427-ARNT respectively.

Figure 15 

Readout system activation for clones in a regular grid pattern
from an interaction library. Twenty-three 384-well microtiter
plates of the sea urchin interaction library were spotted in a
"3x3 duplicate" regular grid pattern around an ink guide-spot on a
222 x 222 mm porous membrane (Hybond N+, Amersham, UK) using a
spotting robot. The membrane was incubated in SD-leu-trp-his
medium for 3 days, assayed for lacZ expression using the β-gal
assay as described by Breeden & Nasmyth (1985) and air dried
overnight. A digital image was captured using a standard A3
computer scanner. 

Figure 16 

Hybridisation of a gene fragment (Probe A) encoding for 
Protein A to an array of DNA from an interaction library. The 
probe was labelled radioactively by standard protocols, and 
hybridisation-positive clones from the interaction library are 
identified by the automated image analysis system. The 
position of clone 5K20, from which the gene fragment was 
isolated, is indicated. Other hybridisation-positive clones
also carry this gene fragment, and by recovery of interacting
members from these clones, a protein-protein interaction 
pathway for Protein A can be uncovered. 

Figure 17 

A graphical representation of the hybridisation-positive 
clones generated by hybridisation of Probe A to a DNA array 
representing the interaction library.

Figure 18 

A graphical representation of hybridisation- and interaction- 
positive clones generated by a subsequent hybridisation with 
probe B (isolated from the clone marked in a grey box). Also
shown are the positions of the hybridisation-positive clones
from probe A. Interaction-positive clones that carry both gene 
fragments are identified as hybridising with both probes. 

Figure 19 

A graphical representation of hybridisation- and interaction- 
positive clones generated by a further hybridisation with 
probe C isolated from the clone 6D18 (marked by a grey box and
"B/C"). Also shown are the hybridisation signals for probes A
and B. By considering common hybridisation signals for
interaction-positive clones and subsequent DNA sequencing of
the inserts carried by these clones, protein-protein
interactions can be uncovered. The figure also shows an
interaction pathway uncovered between Proteins A, B and C based
on these data. 
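
The reasoning applied in Figures 17 to 19 may be illustrated by the following minimal sketch, in which a clone that hybridises with two probes is taken as evidence for an interaction between the two corresponding proteins. The clone lists are invented for the purpose of illustration (only the names 5K20 and 6D18 and the labelling of 6D18 as "B/C" are taken from the figure legends), and the function name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def candidate_interactions(hybridisation: dict) -> dict:
    """Map each probe pair to the clones that hybridise with both probes."""
    probes_per_clone = defaultdict(set)
    for probe, clones in hybridisation.items():
        for clone in clones:
            probes_per_clone[clone].add(probe)

    pairs = defaultdict(set)
    for clone, probes in probes_per_clone.items():
        # A clone carrying two hybridising inserts suggests that the two
        # encoded proteins interact.
        for a, b in combinations(sorted(probes), 2):
            pairs[(a, b)].add(clone)
    return dict(pairs)


# Hypothetical hybridisation-positive clones per probe.
hybridisation = {
    "A": {"5K20", "2C11"},
    "B": {"5K20", "6D18"},
    "C": {"6D18", "9H03"},
}

for (a, b), clones in sorted(candidate_interactions(hybridisation).items()):
    print(f"Protein {a} - Protein {b}: clones {sorted(clones)}")
```

In this invented data set the pairs A/B and B/C are each supported by one clone, which corresponds to a linear pathway A, B, C of the kind described for Figure 19.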

Figure 20 

Automated visual differentiation of yeast cells expressing 
single fusion proteins able to activate the LacZ readout 
system. A defined library of L40ccu yeast clones expressing 
different fusion proteins cloned in the plasmid pBTM117c was 
plated onto minimal medium lacking tryptophan, buffered to pH
7.0 with potassium phosphate and containing 2 µg/ml of X-Gal
(SD-trp/XGAL). White colonies that have not auto-activated the
LacZ reporter gene are automatically recognised and marked
with a red horizontal cross. A colony that has turned blue due
to expression of a single fusion protein able to auto-activate
the LacZ reporter gene is automatically recognised due to its
darker colour and the presence of a "hole". An arrow indicates
this colony. All colonies unsuitable for further analysis and
picking (including those too small or touching other colonies)
are automatically recognised and marked with a blue diagonal
cross.
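
The marking scheme of Figure 20 can be pictured with the following sketch. It assumes that the image analysis has already reduced each colony to a few features; the feature names, the minimum area of 50 pixels and the default grey-scale threshold are hypothetical values chosen for illustration and do not describe the software actually used.

```python
from dataclasses import dataclass


@dataclass
class Blob:
    """Features of one colony as they might be returned by image analysis."""
    area: int          # size in pixels
    mean_grey: float   # average grey scale, 0 (dark) to 255 (bright)
    holes: int         # number of blobs lying within this blob's boundary
    touching: bool     # True if the colony touches a neighbouring colony


def mark_colony(blob: Blob, grey_threshold: float = 80.0, min_area: int = 50) -> str:
    """Return the marking a colony would receive in an image like Figure 20."""
    if blob.area < min_area or blob.touching:
        return "blue diagonal cross"    # unsuitable for picking
    if blob.holes > 0 or blob.mean_grey <= grey_threshold:
        return "blue colony"            # auto-activating, not picked
    return "red horizontal cross"       # white colony, suitable for picking


colonies = [
    Blob(area=200, mean_grey=150.0, holes=0, touching=False),
    Blob(area=200, mean_grey=150.0, holes=1, touching=False),
    Blob(area=20, mean_grey=150.0, holes=0, touching=False),
]
for colony in colonies:
    print(colony, "->", mark_colony(colony))
```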

Figure 21 

Results of automated interaction mating to identify diploid 
yeast strains that express interacting fusion proteins. a)
Progeny of the yeast strains x1a and x2a were spotted at
positions 1 and 2 on a nylon membrane using a spotting robot
such as described by Lehrach et al. (1997). The yeast strains
y1α and y2α of the opposite mating type were subsequently
spotted on positions 1 and 2 which already contained cells
from the strains x1a and x2a. To assist in recognition of the
duplicate spotting pattern, ink was spotted in position 2
directly to the right of the spotted yeast clones. b) The
membrane was transferred to a YPD agar plate and was incubated
at 30 °C overnight to allow interaction mating to occur. c)
Diploid yeast cells that had grown on the membrane were
subsequently analysed for β-galactosidase activity using the
method of Breeden & Nasmyth (1985).

Figure 22 

The two vectors constructed to provide further genetic 
features to enable the method of the invention within a
prokaryotic two-hybrid system. The vectors are based on the
pBAD series of vectors, which provide tight inducible control
of expression of cloned genes using the promoter from the
arabinose operon (Guzman et al., 1995, J. Bact. 177: 4121-
4130), and can be maintained in the same E. coli cell by virtue
of compatible origins of replication.

The plasmid pBAD18-αRNAP expresses, under the control of the
arabinose promoter, fusion proteins between the amino-
terminal domain (NTD) of the α-subunit of RNA polymerase and
DNA fragments cloned into the multiple cloning site. The
presence of this plasmid in kanamycin sensitive cells can be
selected for by plating on growth medium supplemented with
kanamycin, and its absence via the counterselective rpsL
allele by plating on media supplemented with streptomycin
(Murphy et al. 1995).

The plasmid pBAD30-cI expresses, under the control of the
arabinose promoter, fusion proteins between the λcI protein
and DNA fragments cloned into the multiple cloning site. The
presence of this plasmid in ampicillin sensitive cells can be
selected for by plating on growth medium supplemented with
ampicillin, and its absence via the counterselective lacY
gene by plating on media supplemented with
2-nitrophenyl-β-D-thiogalactoside (tONPG) (Murphy et al. 1995).
Additionally, the oriT sequence enables unidirectional genetic
exchange of the pBAD30-cI plasmid and its derivatives from
E. coli cells containing the F' fertility factor to F- strains
lacking the fertility factor.



Examples 

Example 1: Construction of vectors, yeast strains and
readout system for an improved yeast two-hybrid system

1.1 Construction of vectors 

The plasmids constructed for an improved yeast two-hybrid 
system pBTM118 a, b and c and pGAD428 a, b and c are shown in 
Figure 4. Both sets of vectors can be used for the 
construction of hybrid (fusion) proteins. The vectors contain 
the unique restriction sites Sal I and Not I located in the 
multiple cloning site (MCS) region at the 3'-end of the open
reading frame for either the lexA coding sequence or the
GAL4ad sequence (Figure 4b).

With both sets of plasmids, fusion proteins are expressed at
high levels in yeast host cells from the constitutive ADH1
promoter (P) and the transcription is terminated at the ADH1
transcription termination signal (T). The two-hybrid plasmids
shown in Figure 4a are shuttle vectors that replicate
autonomously in both E. coli and S. cerevisiae.

The three plasmids pBTM118 a, b and c are used to generate
fusions of the LexA protein (amino acids 1-220) and a protein
of interest cloned into the MCS in the correct orientation and
reading frame. The plasmids pBTM118 a, b and c are derived
from pBTM117c (Wanker et al., 1997) by insertion of the 
adapters shown in Table 1 into the restriction sites Sal I and 
Not I to generate the improved vectors with three different 
reading frames. 

The plasmids pBTM118 a, b and c carry the wild type yeast CAN1 
gene for counterselection, which confers sensitivity to 
canavanine in transformed yeast cells (Hoffmann, 1985). The
plasmids also contain the selectable marker TRP1, which allows
yeast trp1-auxotrophs to grow on selective synthetic medium
without tryptophan, and the selectable marker bla which 
confers ampicillin resistance in E. coli. 

The plasmids pGAD428 a, b and c are used to generate fusion 
proteins that contain the GAL4 activation domain (amino acids 
768-881) operatively linked to a protein of interest. The 
plasmids pGAD428 a, b and c carry the wild type yeast CYH2 
gene, which confers sensitivity to cycloheximide in 
transformed cells (Kaeufer et al., 1983), the selectable 
marker LEU2, that allows yeast leu2-auxotrophs to grow on 
selective synthetic medium without leucine, and the bacterial 
marker aphA (Pansegrau et al., 1987) which confers kanamycin 
resistance in E. coli. The plasmids pGAD428a, b and c were 
created from pGAD427 by ligation of the adapters shown in 
Table 1 into the MCS to construct the improved vectors with
three different reading frames. 

For the construction of pGAD427, a 1.2 kb Dde I fragment
containing the aphA gene was isolated from pFG101u (Pansegrau
et al., 1987) and was subcloned into the Pvu I site of
pGAD426 using the oligonucleotide adapters 5'-GTCGCGATC-3'
and 5'-TAAGATCGCGACAT-3'. The plasmid pGAD426 was generated by
insertion of a 1.2 kb Eco RV CYH2 gene fragment, which was
isolated from pAS2-1 (Clontech), into the Pvu II site of
pGAD425 (Han and Colicelli, 1995).

1.2 Construction of yeast strains

To allow for the improved yeast two-hybrid system, three
Saccharomyces cerevisiae strains, L40cc, L40ccu and L40ccuα,
were created. The S. cerevisiae strain L40cc was created by
site specific knock-out of the CYH2 and CAN1 genes of L40
(Hollenberg et al., Mol. Cell. Biol. 15: 3813-3822), and
L40ccu was created by site specific knock-out of the URA3 gene
of L40cc (Current Protocols in Molecular Biology, Eds. Ausubel
et al. John Wiley & Sons: 1992). The strain L40ccuα was created
by conducting a mating-type switch of the strain L40ccu by
standard procedures (Ray BL, White CI, Haber JE (1991)). The
genotype of the L40cc strain is: Mata his3Δ200 trp1-901
leu2-3,112 ade2 LYS2::(lexAop)4-HIS3 URA3::(lexAop)8-lacZ GAL4
can1 cyh2. The genotype of the L40ccu strain is: Mata his3Δ200
trp1-901 leu2-3,112 ade2 LYS2::(lexAop)4-HIS3
ura3::(lexAop)8-lacZ GAL4 can1 cyh2, and that of L40ccuα is
Matα his3Δ200 trp1-901 leu2-3,112 ade2 LYS2::(lexAop)4-HIS3
ura3::(lexAop)8-lacZ GAL4 can1 cyh2.

1.3 Readout system 

Figure 5 shows the URA3 readout system carried by the plasmid 
pLUA. This URA3 readout system under the control of a 
bacterial LexAop upstream activation sequence (UAS) can be 
used within the yeast 2-hybrid system both as a
counterselective reporter gene and as a positive selection reporter
gene to eliminate false positive clones. The plasmid contains
the features of the UASlexAop-URA3 readout system, the
selectable marker ADE2 that allows yeast ade2-auxotrophs to
grow on selective media without adenine and the bla gene which
confers ampicillin resistance in E. coli. The plasmid pLUA is
a shuttle vector that replicates autonomously in E. coli and
yeast.

For the construction of pLUA, a 1.5 kb Sac I/Cla I
UASlexAop-URA3 fragment was isolated from pBS-lexURA and ligated
together with a 2.4 kb Sac I/Cla I ADE2 fragment into Cla I
digested pGAD425A. pBS-lexURA was generated by ligating a URA3
fragment together with a UASlexAop fragment into pBluescript
SK+. The URA3 and UASlexAop fragments were obtained by PCR
using genomic DNA from S. cerevisiae strain L40c using
standard procedures and anchor primers which gave rise to
complementary overhangs between the two consecutive fragments,
which were subsequently annealed to generate the chimeric
sequence (see, for example, Current Protocols in Molecular
Biology, Eds. Ausubel et al. John Wiley & Sons: 1992). The
ADE2 gene was isolated by PCR using genomic DNA from SEY6210a.
pGAD425A was generated by deleting a 1.2 kb Sph I fragment
from pGAD425 (Han and Colicelli, 1995) and religation of the
vector.

1.4 Generation of a defined interaction library 

To determine if the invention could be used in an improved 
two-hybrid system for yeast, as shown in Figure 6 or Figure 7, 
a defined interaction library of plasmids that express various
LexA and GAL4ad fusion proteins of interest was constructed 
using the vectors and strains described in sections 1.1 and 
1.2. The orientation of the inserted fragments was determined 
by restriction analysis and the reading frame was checked by 
sequencing. The generated constructs and the original plasmids 
described above are listed in Table 2. The construction of 
pBTM117c-HD1.6, -HD3.6 and -SIM1 was described elsewhere
(Wanker et al., 1997; Probst et al., 1997). pBTM117c-HIP1 and
pGAD427-HIP1 were obtained by ligation of a 1.2 kb Sal I HIP1
fragment isolated from pGAD-HIP1 (Wanker et al., 1997) into
pBTM117c and pGAD427, respectively. pBTM117c-MJD was created
by inserting a 1.1 kb Sal I/Not I MJD1 fragment (Kawagushi et
al., 1994) into pBTM117c, and pGAD427-14-3-3 was generated by
inserting a 1.0 kb EcoRI/NotI fragment of pGAD10-14-3-3 into
pGAD427. For the construction of pGAD427-HIPCT, a 0.5 kb Eco
RI HIP1 fragment isolated from pGAD-HIPCT (Wanker et al.,
1997) was ligated into pGAD427. pGAD427-lexA and pGAD427-ARNT
were generated by insertion of a 1.2 kb Sal I/Not I digested
lexA PCR fragment and a 1.4 kb Sal I/Not I ARNT fragment into
pGAD427, respectively.

It was shown that the fusion proteins LexA-SIM1 and
GAL4ad-ARNT specifically interact with each other in the yeast
two-hybrid system (Probst et al., 1997): when both hybrids
were coexpressed in Saccharomyces cerevisiae containing two
integrated reporter constructs, the yeast HIS3 gene and the
bacterial lacZ gene, which both contained binding sites for
the LexA protein in the promoter region, the interaction
between these two fusion proteins led to the transcription of
the reporter genes. The fusion proteins by themselves were not
able to activate transcription because GAL4ad-ARNT lacks a DNA
binding domain and LexA-SIM1 lacks an activation domain (Probst
et al., 1997). In contrast, it was shown recently that the fusion
proteins LexA-HIP1 and GAL4ad-LexA are capable of activating
the HIS3 and lacZ reporter genes without interacting with a
specific GAL4ad or LexA fusion protein respectively. Thus,
yeast clones expressing the LexA-HIP1 protein have to be
designated as false positives, because false positives are
defined here as clones in which a GAL4ad fusion protein or a
LexA fusion protein alone activates the transcription of the
reporter genes without the need for any interacting partner
protein.

The predicted protein-protein interactions of these fusion 
proteins are shown in Figure 8. It was shown that the fusion
proteins LexA-SIM1 & GAL4ad-ARNT, LexA-HD1.6 & GAL4ad-HIP1 and
LexA-HD3.6 & GAL4ad-HIP1 specifically interact with each other
in the yeast two-hybrid system because they only activate the
reporter genes HIS3 and lacZ when both proteins are present in
one cell (Probst et al. 1997; Wanker et al. 1997). In
contrast, it was demonstrated that the LexA-HIP1 and
GAL4ad-LexA fusion proteins are capable of activating the reporter
genes without the need for any interacting fusion protein. The
proteins LexA and GAL4ad and the fusion proteins LexA-MJD and
GAL4ad-14-3-3 which are also present in the defined
interaction library are unable to activate the reporter genes 
either alone or when present in the same cell with any other 
fusion proteins comprising the library. 

Example 2: Detection of clones expressing known 
interacting proteins from false positives using the improved 
two-hybrid system 

Pairs of the yeast two-hybrid plasmids pBTM117c-SIM1 &
pGAD427-ARNT, pBTM117c & pGAD427 and pBTM117c-HIP1 & pGAD427 were
transformed into the yeast strain L40cc, and Trp+Leu+
transformants that contained at least one of each of the two
plasmids were selected on SD-leu-trp plates. Two transformants
from each transformation were investigated for the presence of
protein-protein interactions by testing the ability of the
yeast cells to grow on SD-leu-trp, SD-leu-trp-his, SD-leu+CAN
and SD-trp+CHX plates and by the β-galactosidase membrane
assay (Breeden and Nasmyth, 1985). Figure 9 shows that the
yeast cells harboring the plasmids pBTM117c-SIM1 &
pGAD427-ARNT or pBTM117c-HIP1 & pGAD427 grew on SD-leu-trp-his
plates and turned blue after incubation in X-Gal solution,
indicating that the HIS3 and lacZ reporter genes are activated 
in these strains. In comparison, the yeast strain harboring 
both the negative control plasmids pBTM117c & pGAD427 was not 
able to grow on this medium and also showed no lacZ activity. 
After selection of the yeast strains harboring the different 
combinations of the two-hybrid plasmids on SD-leu+CAN and
SD-trp+CHX, the resulting strains were also analyzed by the
β-galactosidase assay. After incubating the membrane containing
all three strains on SD-trp+CHX medium, only progeny of the
yeast strain that originally harbored both the plasmids
pBTM117c-HIP1 & pGAD427, yet which had lost the pGAD427 plasmid
through counterselection, turned blue after incubating in X-Gal
solution. This result indicates that this clone is a false
positive, because although showing a lacZ+ phenotype when
grown on SD-leu-trp-his medium, the LexA-HIP1 fusion protein
was also capable of activating the HIS3 and lacZ genes on
SD-trp+CHX medium without the need for any interacting fusion
protein. In comparison, the yeast strain harboring both the
plasmids pBTM117c-SIM1 & pGAD427-ARNT is a positive clone that
expresses interacting LexA and GAL4ad fusion proteins, because
both the LexA and the GAL4ad fusion proteins are necessary for
the activation of the reporter genes. If either of the
plasmids pBTM117c-SIM1 or pGAD427-ARNT is lost from the strain
by counterselection on SD-leu+CAN or SD-trp+CHX, respectively,
the resulting cells are no longer able to activate the lacZ
reporter gene and do not turn blue after incubation in X-Gal
solution. With the membranes from the SD-leu+CAN plate, false
positive clones expressing an auto-activating GAL4ad-LexA
fusion protein were also detected by the β-galactosidase
assay.
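
The interpretation of the three readouts in this example can be summarised in the following sketch. It assumes, as described in Example 1, that the LexA fusion is carried on the TRP1/CAN1 plasmid and the GAL4ad fusion on the LEU2/CYH2 plasmid, so that cells on SD-trp+CHX retain only the LexA fusion and cells on SD-leu+CAN retain only the GAL4ad fusion. The function name and the returned labels are hypothetical.

```python
def classify_clone(blue_on_his: bool, blue_on_trp_chx: bool, blue_on_leu_can: bool) -> str:
    """Interpret the β-galactosidase pattern of one clone on three media.

    blue_on_his      -- lacZ active on SD-leu-trp-his (both plasmids present)
    blue_on_trp_chx  -- lacZ active on SD-trp+CHX (GAL4ad plasmid lost)
    blue_on_leu_can  -- lacZ active on SD-leu+CAN (LexA plasmid lost)
    """
    if not blue_on_his:
        return "negative"
    if blue_on_trp_chx and blue_on_leu_can:
        return "false positive (both fusions auto-activate)"
    if blue_on_trp_chx:
        return "false positive (LexA fusion auto-activates)"
    if blue_on_leu_can:
        return "false positive (GAL4ad fusion auto-activates)"
    return "positive (both fusions required)"


# Patterns corresponding to the controls discussed in this example.
controls = {
    "pBTM117c-SIM1 & pGAD427-ARNT": (True, False, False),
    "pBTM117c & pGAD427": (False, False, False),
    "pBTM117c-HIP1 & pGAD427": (True, True, False),
}
for name, pattern in controls.items():
    print(name, "->", classify_clone(*pattern))
```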

Example 3: Generation of regular grid patterns of host
cells expressing potentially interacting fusion proteins

3.1 Generation of a regular grid pattern of clones from an 
interaction library in microtiter plates using automation 

To generate the well defined interaction library, the
constructs for the expression of the fusion proteins shown in
Figure 8 were pooled and 3 µg of the mixture was
co-transformed into yeast strain L40cc by the method of Schiestel
& Gietz (1989). The yeast cells co-transformed with the
constructs described in Table 2 were plated onto large 24 x 24
cm agar trays (Genetix, UK) containing minimal medium lacking
tryptophan, leucine and histidine (SD-leu-trp-his). The agar
trays were poured using an agar-autoclave and pump (Integra,
Switzerland) to minimise tray-to-tray variation in agar colour
and depth. To maximise the efficiency of automated picking,
the transformation mixture was plated such that between 200
and 2000 colonies per agar tray were obtained after incubation
at 30°C for 4 to 7 days.

Suitable changes to the hardware and software of a standard
picking robot designed for the picking of E. coli cells as 
described by Lehrach et al. (1997) were made to account for 
the specific requirements of yeast cells. The illumination of
agar-trays containing plated colonies was changed from
dark-field sub-illumination to dark-field top-illumination to
differentiate yeast colonies from the lawn of non-transformed
cells. The existing vision guided motion system (Krishnaswamy
& Agapakis 1997) was modified to allow for a larger range of
"blob" size when selecting yeast colonies to pick from the
blob-feature-table returned by connectivity algorithms when
applied to a digital image of the agar tray containing 
colonies. The clone inoculation routine was re-programmed to 
ensure that cell material which had dried on the picking pins 
during the picking routine was initially re-hydrated by 10 
seconds of immersion in the wells of a microtiter plate before 
vigorous pin-motion within the well. This robotic procedure 
ensured that sufficient cell material was inoculated from each 
picking pin into an individual well of a microtiter plate. The 
picking pins were sterilised after inoculation to allow the 
picking cycle to be repeated by programming the robot to brush 
the picking pins in a 0.3% (v/v) solution of hydrogen 
peroxide, followed by a 70% ethanol rinse from a second wash- 
bath and finally drying by use of a heat-gun to evaporate any
remaining ethanol from the pins. Furthermore, an algorithm to 
automatically correct for height variation in the agar was 
incorporated by referencing the surface height of the agar in 
three corners and from these points automatically estimating 
the surface plane of the agar. The robot was further 
programmed to automatically adjust both the imaging and 
picking heights according to the agar surface height such that 
when a pin was extended into a colony, it removed cells only 
from the top surface of the colony and did not penetrate the 
whole colony into the growth medium. Finally, we incorporated 
additional selection criteria that would reliably sort between 
blue and white colonies . Although the robot provided a method 
to select only those "blobs' (colonies) within a range of 
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average grey scales (eg, > 80 for white colonies) , this proved 
unreliable since the actual value of average grey scale 
required to make a correct discrimination varied across the 
agar-tray due to slight variation in intensity of the 
illumination. Therefore, a new method was implemented that 
automatically modified this discrimination value based on the 
average illumination of a region of the agar-tray as measured 
by the camera on a frame-to-frame basis. Often, a "blue" 
colony that activated the readout system was not uniformly 
blue across its whole area, but only the centre would be 
blue and the surrounding cell material was white. In such 
cases, the connectivity algorithms would detect two "blobs": 
one (the blue centre) lying directly on the other (the white 
surrounding) and although the former would be ignored since it 
was blue, the latter would be selected as its average grey- 
scale was greater than the discrimination value. Such cases 
were successfully selected against by ignoring any colonies 
that had "holes", using a "number of holes" function of the 
image analysis program, which flags those blobs which have a 
second blob within their boundary. 
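
The selection criteria described above can be illustrated by the 
following minimal sketch. It is not the program used on the 
robot; the record layout, the function names, the size limits 
and the reference illumination are hypothetical, and only the 
grey-scale value of 80 is taken from the description. The 
discrimination value is scaled by the average illumination of 
the local region, and any blob with a hole is rejected. 

```python
from dataclasses import dataclass


@dataclass
class Blob:
    """One row of a hypothetical blob-feature-table."""
    area: int            # blob size in pixels
    mean_grey: float     # average grey scale of the blob (0-255)
    holes: int           # number of blobs lying within this blob's boundary
    region_mean: float   # average illumination of the surrounding tray region


def local_threshold(region_mean, base_threshold=80.0, reference_mean=128.0):
    """Scale the grey-scale discrimination value by the local illumination.

    reference_mean is an assumed illumination at which base_threshold applies.
    """
    return base_threshold * region_mean / reference_mean


def is_pickable_white(blob, min_area=50, max_area=2000):
    """Return True for a white colony in the size range that has no holes."""
    if not (min_area <= blob.area <= max_area):
        return False
    if blob.holes > 0:
        # A blue centre lying on a white surround is reported as a hole.
        return False
    return blob.mean_grey > local_threshold(blob.region_mean)
```

In this sketch a colony of average grey scale 70 is rejected 
under the reference illumination but accepted in a dimmer region 
of the tray, which is the behaviour the frame-to-frame 
correction is intended to provide. 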

Using these modifications to a laboratory picking robot, 
individual yeast colonies were automatically picked from the 
agar-trays into individual wells of a sterile 384-well 
microtiter plate (Genetix, UK) containing sterile liquid 
minimal medium lacking leucine and tryptophan (SD-leu-trp) and 
containing 7% (v/v) glycerol. The resulting microtiter plates 
were incubated at 30°C for 36 hours, the settled colonies were 
dispersed by vigorous mixing using a 384-well plastic 
replicating tool (Genetix, UK) and then incubated for a 
further 2 to 4 days. A picking success of over 90% (wells 
containing a growing yeast culture) was achieved. After growth 
of yeast strains within the microtiter plates, each plate was 
labelled with a unique number and barcode. Each plate was also 
replicated to create two additional copies using a sterile 
384-pin plastic replicator (Genetix, UK) to transfer a small 
amount of cell material from each well into pre-labelled 384- 
well microtiter plates pre-filled with SD-leu-trp-his/7% 
glycerol liquid medium. The replicated plates were incubated 
at 30 °C for 3 days with a cell dispersal step after 36 hours, 
subsequently frozen and stored at -70°C together with the 
original picked microtiter plates of the interaction library. 

In this manner, a regular grid pattern of yeast clones 
expressing potentially interacting fusion proteins was generated 
using a robotic and automated picking system. 384-well 
microtiter plates have a well every 4.5 mm in a 16 by 24 well 
arrangement. Therefore, for each 384-well microtiter plate a 
regular grid pattern at a density greater than 4 clones per 
square centimetre was automatically created. 
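
The stated density follows directly from the well spacing. As a 
minimal sketch, assuming a square grid with one clone per well 
(the function name is illustrative only): 

```python
def grid_density_per_cm2(pitch_mm):
    """Clones per square centimetre for a square grid with the given pitch."""
    pitch_cm = pitch_mm / 10.0
    return 1.0 / (pitch_cm * pitch_cm)


# 384-well microtiter plate: one well every 4.5 mm
density_384 = grid_density_per_cm2(4.5)
print(f"{density_384:.1f} clones per square centimetre")
```

A spacing of 4.5 mm gives approximately 4.9 clones per square 
centimetre, consistent with the figure of greater than 4 given 
above. The same calculation applies to any other regular 
spacing. 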

3.2 Creation of regular grid patterns of increased density 

To generate arrays with higher densities, a computer- 
controlled 96-well pipetting system (Opal-Jena) with automatic 
plate-stacking, tip washing, liquid waste and accurate x-y 
positioning of the microtiter plate currently accessed by the 
tips was employed. The yeast two-hybrid cells that had settled 
in the bottom of the wells of the arrayed interaction library 
as described above were re-suspended, and a stack of these 
384-well plates was placed into the input stacker of the 
pipetting system. The system was programmed to take a single 
384-well microtiter plate containing the arrayed yeast two- 
hybrid clones and aspirate, in parallel, 10 µl of culture medium 
and cells into each of the 96 pipette tips from 96 wells of 
the 384-well plate. The inter-tip spacing of the 96 tips was 
9 mm and the well spacing of the 384-well microtiter plate was 
4.5 mm, so that cells were removed from only every other well 
along each dimension of the 384-well plate. 8 µl of each of the 
96 aspirated samples contained in the tips was then pipetted in 
parallel 
into one set of wells of a sterile 1536-well microtiter plate 
(Greiner, Germany). Since the inter-well spacing of this 1536- 
well microtiter plate is 2.25 mm, yeast cells were deposited 
into only 1 in every 4 wells along each dimension of the 1536- 
well plate. The remaining 2 µl of culture medium and cells was 
aspirated to waste before sterilising all 96 tips in 
parallel. Sterilisation was conducted by twice aspirating and 
washing to waste 50 µl of 0.3% (v/v) hydrogen peroxide stored 
in a first replenishable wash-bath on the system, and then 
aspirating and washing to waste 50 µl of sterile distilled water 
stored in a second replenishable wash-bath. 

This plate-to-plate pipetting cycle was repeated 3 further 
times, each time aspirating a different set of 96 clones from 
the 384-well array of yeast 2-hybrid clones into a different 
set of 96 wells in the 1536-well microtiter plate by moving 
the microtiter plates relative to the 96 tips using the 
accurate x-y positioning of the system. When all clones of the 
first 384-well microtiter plate had been sampled and arrayed 
into the 1536-well plate, the first 384-well microtiter plate 
was automatically exchanged for the next 384-well microtiter 
plate, and the yeast 2-hybrid clones arrayed in this second 
384-well plate were similarly arrayed into the 1536-well 
plate. When the yeast 2-hybrid clones contained within four 
384-well microtiter plates had been automatically arrayed in 
the first 1536-well plate, filling all wells, the 1536-well 
plate was automatically exchanged for a second sterile 1536- 
well plate stored in the second stacking unit of the pipetting 
system. The whole process was repeated until all yeast 2- 
hybrid clones of the interaction library had been 
automatically transferred from 384-well to 1536-well 
microtiter plates. 
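
The bookkeeping implied by this transfer can be sketched as 
follows. The description fixes only the geometry (96 tips at 
9 mm, reaching every other well of a 384-well plate and every 
fourth well of a 1536-well plate); the particular assignment of 
source plates and pipetting cycles to offsets used below, and 
the function name, are assumptions made for illustration. 

```python
ROWS_384, COLS_384 = 16, 24


def target_well(plate_index, row, col):
    """Map a well of source plate 0..3 to (row, col) in the 1536-well plate.

    Rows and columns are zero-based. The 2 x 2 arrangement of the four
    source plates within each 4 x 4 block is an assumed convention.
    """
    if not (0 <= plate_index < 4 and 0 <= row < ROWS_384 and 0 <= col < COLS_384):
        raise ValueError("well outside a set of four 384-well plates")
    # The tips reach every other well, so each plate needs 2 x 2 cycles.
    tip_row, cycle_row = divmod(row, 2)
    tip_col, cycle_col = divmod(col, 2)
    plate_row, plate_col = divmod(plate_index, 2)
    # Each tip serves a 4 x 4 block of wells in the 1536-well plate.
    return (4 * tip_row + 2 * plate_row + cycle_row,
            4 * tip_col + 2 * plate_col + cycle_col)
```

Under this convention the wells filled in any one pipetting 
cycle lie 4 wells apart in each dimension, and the four source 
plates together fill every one of the 1536 wells exactly once. 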

In this manner, a regular grid pattern of yeast clones 
expressing potentially interacting fusion proteins was generated 
using a computer-controlled pipetting system. 1536-well 
microtiter plates have a well every 2.25 mm in a 32 by 48 well 
arrangement. Therefore, for each 1536-well microtiter plate we 
automatically created a regular grid pattern at a density 
greater than 19 clones per square centimetre. 

3.3 Generation of a regular grid pattern of clones from an 
interaction library on porous carriers using automation 

A high-throughput spotting robot such as that described by 
Lehrach et al. (1997) was used to construct porous planar 
carriers with a high-density regular grid-pattern of yeast 
clones from the defined interaction library contained within 
384-well microtiter plates. The robot recorded the position of 
individual clones in the high-density grid-pattern by the use 
of a pre-defined duplicate spotting pattern and the barcode of 
the microtiter plate. Individually numbered membrane sheets 
sized 222 x 80 mm (Hybond N+, Amersham UK) were pre-soaked in 
SD-leu-trp-his medium, carefully laid on a sheet of 3MM filter 
paper (Whatman) pre-soaked in the same medium and placed in 
the bed of the robot. The interaction library was 
automatically arrayed as replica copies onto the membranes 
using a 384-pin spotting tool affixed to the robot. Five 
different microtiter plates from the first copy of the 
interaction library were replica spotted in a "3x3 duplicate" 
pattern around a central ink guide-spot onto 10 nylon 
membranes, corresponding to approximately 1900 clones spotted 
at a density of approximately 40 spots per cm². On each 
replica membrane three different control clones were spotted, 
each from a microtiter plate that contained the same control 
clone in every well. One control clone expressed the fusion 
proteins LexA-SIM1 & GAL4ad-ARNT, a second control clone the 
fusion protein LexA-HIP1, while a third expressed the fusion 
protein GAL4ad-LexA, and all were spotted in order to test the 
selection, counterselection and the β-gal assay features of 
the method. To ensure the number of yeast cells on each spot 
was sufficient for those membranes which were to be placed on 
the counterselection media plates, the robot was programmed to 
spot onto each spot position 5 times from a slightly different 
position within the wells of the microtiter plates. The robot 
created a data-file in which the spotting pattern produced and 
the barcode that had been automatically read from each 
microtiter plate were recorded. 

Each membrane was carefully laid onto approximately 300 ml of 
solid agar media in 24 x 24 cm agar-trays. Six membranes were 
transferred to SD-leu-trp-his media and two each of the 
remaining membranes were transferred to either SD-trp+CHX or 
SD-leu+CAN media. The yeast colonies were allowed to grow on 
the surface of the membrane by incubation at 30 °C for 3 days. 

3.4 Generation of a regular grid pattern of clones from an 
interaction library on non-porous carriers using automation 

The plasmid pGNG1 (MoBiTec, Germany) carries a green 
fluorescent protein variant under the control of a LexA 
operator. This variant, GFPuv, is up to 16 times brighter than 
the wild-type variant isolated from Aequorea victoria (Ausubel 
et al., 1995; Short protocols in molecular biology, 3rd ed. 
John Wiley & Sons, New York, NY). The yeast 2µm origin of 
replication and the auxotrophic marker URA3 maintain the 
plasmid in ura3 mutant yeast strains. This plasmid should act 
as a readout system to detect single fusion proteins or 
interacting fusion proteins able to activate the readout 
system in the method of the invention described herein. As is 
known in the art, green fluorescent protein and its variants 
are considered suitable reporters in most host-cell 
types. Therefore, it would be possible for a person skilled in 
the art to incorporate this gene within other host-cell types 
and interaction systems as disclosed in this invention. 

The yeast strain L40ccu was transformed with the plasmid pGNG1 
(MoBiTec, Germany) using the method of Schiestl & Gietz 
(1989), and a resulting stable transformant clone was cultured 
in minimal medium lacking uracil and subsequently used to 
generate two further yeast clones, each containing two genetic 
elements. The first strain, GNGp, was generated by co- 
transformation of a mixture of the plasmids pBTM117c-SIM1 and 
pGAD427-ARNT into L40ccu already carrying the 
reporter plasmid pGNG1. The second strain, GNGn, was generated 
by co-transformation of a mixture of the plasmids pBTM117c-MJD 
and pGAD427-14-3-3 into L40ccu already carrying 
the reporter plasmid pGNG1. In both cases, the transformations 
were conducted using the method of Schiestl & Gietz (1989), 
and transformants were selected by plating on minimal media 
lacking uracil, tryptophan and leucine. 

Individual colonies from the two transformations were picked 
into individual wells of 384-well microtiter plates as 
described in section 3.1 except that the microtiter plates 
contained liquid minimal medium lacking uracil, tryptophan and 
leucine. One microtiter plate was created that contained 
individual colonies of the GNGp yeast strain, and another 
carrying colonies of GNGn. Using a spotting robot (Lehrach et 
al., 1997) fitted with a high-precision spotting tool carrying 
16 pins in a 4 x 4 pattern, the clones were arrayed onto poly- 
lysine coated glass slides (Sigma, US). The clones were spotted 
at a spacing of 440 µm, with a spot diameter of approximately 
300 µm, generating a density of over 490 clones per square 
centimetre. To increase the amount of cell material deposited 
at each spot, the robot was programmed to spot onto each spot 
position 10 times from a slightly different position within 
the wells of the microtiter plates. It is well known in the 
art that piezo-ink-jet micropipetting systems (Kietzmann et 
al., 1997; Schober et al., 1993) can create regular grid 
patterns of clones at an even greater density. Indeed, grid 
densities of over 1600 spots per square centimetre have been 
achieved with such systems. 

The fluorescent readout system of cells in the regular grid 
pattern of cells was then visualised using a sensitive CCD 
camera (LAS1000, Fuji, Japan). Appropriate excitation light 
was provided and an emission filter appropriate for the 
emission spectrum of GFPuv was fitted to the lens. Other 
imaging systems could be utilised to investigate the regular 
grid pattern of clones. For example, laser-scanning systems 
including laser scanning confocal microscopes would be 
preferred when imaging very high density regular grid 
patterns, or for those formed from a small number of host 
cells deposited at each position. 

It was shown that the fusion proteins LexA-SIM1 and GAL4ad- 
ARNT can interact and activate a readout system under control 
of the LexA operator. Since the GFPuv reporter gene is under 
the control of a LexA operator, a cell carrying the pGNG1 
plasmid and expressing these fusion proteins should fluoresce 
under UV light. In contrast, the fusion proteins LexA-MJD and 
GAL4-14-3-3 were shown to be unable to activate the same readout 
system. Image analysis of the digital image of the regular 
grid pattern of yeast cells demonstrated that, indeed, the 
GNGp yeast strain did fluoresce while the GNGn did not. 

As an alternative to pGNG1, a person skilled in the art could 
subclone an improved GFP mutant as described in Anderson et 
al. (1996). Replacement of the URA coding sequence in pLUA 
(section) with GFP is performed by using appropriate anchor 
primers to amplify the GFP mutant. Using the appropriate growth 
media the analysis can be performed as described above. 

Example 4: Detection of the readout system in a regular 
grid pattern. 

4.1 Detection of readout system activation in a regular grid 
pattern of clones from an interaction library on planar 
carriers using digital image capture, processing and analysis 

Two membranes from each of the selective media described in 
section 3.3 were assayed for lacZ expression using the β-gal 
assay as described by Breeden & Nasmyth (1985) and air dried 
overnight. For each membrane, a 24-bit digital BMP (bitmap) 
image of the β-gal assay was captured using a standard A3 
computer scanner, and the images were stored on computer. The 
yeast strain used to create the defined interaction library 
was an ade2 auxotrophic mutant, and those colonies that grew 
yet did not activate the readout system were pink in colour 
when mature. Since image analysis programs used for the 
analysis of DNA grids use single channel (grey-scale) images, 
it was necessary to convert this colour image to an 8-bit 
grey-scale image. However, the pink colour of colonies not 
expressing the β-gal reporter gene, when converted to grey- 
scale, would lower the contrast between positive and negative 
activation states of the readout system. Therefore, the pink- 
red colours of the image were re-mapped to light yellow before 
processing the remapped 24-bit colour image to a colour- 
inverted 8-bit grey-scale TIF (tagged image file format) using 
the software Photo Magic (Micrografx, USA). One non-inverted 
8-bit grey-scale image of the defined interaction library that 
was grown on membranes placed on each of the 3 selective media 
and subsequently assayed for β-gal activity is shown in Figure 
10. 
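
The effect of the colour re-mapping can be illustrated per pixel 
by the following minimal sketch. It does not reproduce the 
operation of the commercial software; the test for a pink-red 
pixel, the replacement colour and the use of ITU-R BT.601 
weights for the grey-scale conversion are all assumptions made 
for illustration. 

```python
LIGHT_YELLOW = (255, 255, 200)  # assumed replacement colour


def is_pink_red(rgb, min_red=150, min_excess=40):
    """Heuristic test: the red channel clearly exceeds green and blue."""
    r, g, b = rgb
    return r >= min_red and (r - g) >= min_excess and (r - b) >= min_excess


def luminance(rgb):
    """8-bit grey value computed with ITU-R BT.601 luma weights."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def inverted_grey(rgb, remap_pink=True):
    """Colour-inverted grey value, optionally re-mapping pink-red first."""
    if remap_pink and is_pink_red(rgb):
        rgb = LIGHT_YELLOW
    return 255 - luminance(rgb)
```

For an example pink pixel such as (255, 182, 193) the inverted 
grey value is lower with re-mapping than without, whereas a blue 
pixel such as (30, 80, 160) is unaffected, so the difference 
between the two activation states is increased. 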

Individual clones of the interaction library can be identified 
and their position on the high-density spotted filter 
converted to specific wells in the microtiter plates using an 
automated image analysis system as described by Lehrach et al. 
(1997). Here, the basic grid and node position of each clone 
is determined through an iterative sampling scheme proposed by 
Geman & Geman (1984). Once the node positions have been 
determined, the average grey-scale value of a pixel mask 
appropriately sized for the average colony diameter is 
recorded from the image for every colony on the filter. From 
these intensity data, global and block-specific background 
corrections are made, giving greater weight to the local 
block-specific background. Each colony is then classified into 
one of four β-galactosidase activity classes by appropriate 
binning of the background-corrected intensities. 
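
The background correction and binning steps can be sketched as 
follows. The weighting factor, the bin boundaries and the 
function names are hypothetical and are not taken from the 
image analysis system referred to above; it is further assumed 
that a higher intensity corresponds to higher activity, as in a 
colour-inverted image. 

```python
def corrected_intensity(raw, global_bg, block_bg, block_weight=0.75):
    """Subtract a weighted background that favours the local block value."""
    background = block_weight * block_bg + (1.0 - block_weight) * global_bg
    return max(raw - background, 0.0)


def activity_class(intensity, bin_edges=(10.0, 40.0, 90.0)):
    """Bin a corrected intensity into class 0, 1, 2 or 3.

    The class is the number of (assumed) bin edges the intensity reaches.
    """
    return sum(intensity >= edge for edge in bin_edges)
```

With these example values, a raw intensity of 100 measured over 
a global background of 20 and a block background of 40 is 
corrected to 65 and falls into class 2. 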

Positive clones that expressed interacting fusion proteins 
were distinguished from false positive clones by considering 
the activity of β-galactosidase of clones grown on spotted 
membranes laid on the various selective media. Positive clones 
should activate the lacZ reporter gene on SD-leu-trp-his media 
and turn blue on incubation with X-Gal solution, but not on 
either of the two counterselective media. False positive 
clones should activate the reporter gene and turn blue on 
incubation with X-Gal solution on at least one 
counterselective medium as well as on the SD-leu-trp-his 
medium. 

Figure 11 shows magnified images of a β-gal assay of clones 
grown on the membranes which had been placed on the three 
selective media. Within the magnified region of the membranes 
shown in Figure 11a, two clones, whose spotted positions are 
circled, were detected as positive clones that express 
interacting fusion proteins since they activated the lacZ 
reporter gene on SD-leu-trp-his media, but not on either of 
the two counterselective media. The two clones were identified 
by their microtiter plate address within the interaction 
library as 06L22 and 08N24 respectively. All other clones 
spotted within this region of the membrane were detected as 
false positive since they express β-galactosidase on SD- 
trp+CHX medium as well as on SD-leu-trp-his medium. 

Expression of the LacZ reporter gene for the three control 
clones spotted onto the same membranes confirms these results. 
The positive control clone that expresses the interacting 
fusion proteins LexA-SIM1 & GAL4ad-ARNT should show a LacZ+ 
phenotype when grown on SD-leu-trp-his medium, but LacZ- when 
grown on either of the counterselective media. This control 
clone was spotted at position 03 in the region of the 
membranes shown in Figure 11b, of which one example is 
circled. The pattern of β-gal activity for this positive 
control clone on the three selective media is as predicted. 
The false positive control clone that expresses the fusion 
protein LexA-HIP1 and the false positive clone that expresses 
the fusion protein GAL4ad-LexA are spotted at positions 02 and 
01 respectively. Both false positive control clones show a 
LacZ+ phenotype when grown on SD-leu-trp-his media, but are 
detected as false positive clones by the method of the 
invention since they also show a LacZ+ phenotype on SD-leu+CAN 
or SD-trp+CHX media, respectively. The clones spotted at 
position 04 are from the defined interaction library, and from 
their LacZ+ phenotype when grown on SD-leu+CAN media are 
predicted to be false positive clones. 

The image analysis system described above was used to 
automatically identify those individual clones on each high- 
density regular grid pattern that had activated the LacZ 
readout system. This was conducted for each of the membranes 
grown on the three selective media, and the intensity of β- 
galactosidase activity for each clone grown on the three media 
was automatically recorded by the program using a scale from 0 
to 3 (no activity, weak activity, medium activity, high 
activity). These data for all clones on a given membrane were 
saved in a computer file, and the β-galactosidase activity for 
a given clone was related to its activity when grown on the 
other two selective media using a computer program. This 
program was used to query and identify all clones from the 
interaction library that had activated the reporter gene when 
grown on SD-leu-trp-his (score greater than 0), yet had not on 
either of the counterselective media (score on both media 
equal to 0). Figure 12a shows a subset of these clones 
identified using this data-query procedure, and Figure 12b 
shows that the two clones 06L22 and 08N24 are found within 
this automatically identified data-set of positive clones. 
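
The data-query described above amounts to a simple filter over 
the recorded scores. A minimal sketch follows; the data layout 
and the function name are hypothetical, and the clone addresses 
and scores used in the example are invented for illustration 
and are not results of the experiment. 

```python
def find_positive_clones(scores):
    """Return addresses of clones active only on the selective medium.

    `scores` maps a clone address to a dict of activity scores (0 to 3)
    keyed by the name of the medium on which the membrane was grown.
    """
    positives = []
    for clone, by_medium in sorted(scores.items()):
        if (by_medium["SD-leu-trp-his"] > 0
                and by_medium["SD-trp+CHX"] == 0
                and by_medium["SD-leu+CAN"] == 0):
            positives.append(clone)
    return positives


# Invented example records, not experimental data
example_scores = {
    "01A01": {"SD-leu-trp-his": 2, "SD-trp+CHX": 0, "SD-leu+CAN": 0},
    "01A02": {"SD-leu-trp-his": 3, "SD-trp+CHX": 2, "SD-leu+CAN": 0},
    "01A03": {"SD-leu-trp-his": 0, "SD-trp+CHX": 0, "SD-leu+CAN": 0},
    "01A04": {"SD-leu-trp-his": 1, "SD-trp+CHX": 0, "SD-leu+CAN": 1},
}
print(find_positive_clones(example_scores))
```

Only the first example record satisfies all three conditions; 
the second and fourth are rejected as false positives and the 
third shows no activation of the readout system. 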

4.2 Detection of readout system activation in a regular grid 
pattern of clones from an interaction library in microtiter 
plates using digital image capture, processing and analysis. 

The interaction library comprising the yeast cells as 
described in section 3.1 was screened in microtiter plate 
format to identify those cells that express interacting fusion 
proteins. First, microtiter plates containing the interaction 
library were removed from frozen storage and thawed to room 
temperature. Second, each plate was replicated and labelled as 
described in section 3.1 to create additional copies for 
screening, each into 3 separate selective media. Cells were 
transferred into 384-well microtiter plates pre-filled with 40 
µl of the liquid selective media SD-leu-trp, SD-leu+CAN or SD- 
trp+CHX. Third, after growth for 4 days at 30 °C, 10 µl of 
Yeast One Step Yeast Lysis Buffer containing Galacton-Star and 
Sapphire II (Tropix, US) was added, the cells were dispersed 
using a plastic replication tool, and the plates incubated for 
40 min at 37°C. Finally, a digital image of six plates was 
obtained in parallel using a LAS1000 CCD camera (Fuji, Japan), 
by placing the plates side-by-side in a two by three 
arrangement. The β-galactosidase substrate Galacton-Star, in 
combination with Sapphire II (Tropix, US), generates detectable 
luminescent light on activation of the β-gal reporter gene in 
the yeast cells, and an exposure time of 5 minutes was used to 
collect sufficient signal. The grey-scale digital images were 
captured, saved on computer and subsequently analysed using 
the image analysis system described in section 4.1. However, 
in this case, the position of each clone was far simpler to 
determine due to the lower density of the regular grid pattern 
of clones in the microtiter plate. In addition, the size of the 
pixel mask used to measure the average pixel intensity was 
approximately that of the microtiter plate well. 
Positive clones in the six microtiter plates were identified 
by image analysis of the digital images from clones grown in 
the three selective media, and these data processed by the 
computer program as described in section 4.1. 

Example 5: Identification of individual members of the 
interaction 

The interaction library constructed for this example was 
composed of known fusion proteins with predicted interactions 
as shown in Figure 8. A real positive clone from this defined 
interaction library is therefore expected to express the 
interacting fusion protein-pairs LexA-SIM1 & GAL4ad-ARNT, 
LexA-HD1.6 & GAL4ad-HIP1 or LexA-HD3.6 & GAL4ad-HIP1 and hence 
contain the corresponding pairs of plasmid constructs 
pBTM117c-SIM1 & pGAD427-ARNT, pBTM117c-HD1.6 & pGAD427-HIP1 or 
pBTM117c-HD3.6 & pGAD427-HIP1, respectively. The 
identification of individual members that comprise an 
interaction between fusion proteins that are expressed within 
a single cell can be made by a variety of means as outlined in 
Figure 1, Figure 6 and Figure 7. Three independent methods, 
nucleic acid hybridisation, PCR and DNA sequencing, were used 
to identify the individual plasmid constructs that expressed 
the interacting fusion proteins in the positive clones 06L22 
and 08N24. 

5.1 Identification of individual members of the interaction 
by nucleic acid hybridisation 

The four membranes which had been placed on the SD-leu-trp-his 
medium and had not been used to assay β-gal activity were 
processed according to the procedure described in Larin & 
Lehrach (1990) in order to affix the DNA contained within the 
clones of the interaction library onto the surface of the 
membrane. A 1.1 kb DNA fragment of SIM1 and a 1.3 kb DNA 
fragment of ARNT were radioactively labeled by standard random 
priming procedures for use as hybridisation probes (Feinberg 
& Vogelstein, 1983). Each probe was heat denatured for 10 min 
at 95 °C and hybridised overnight at 65 °C in 15 ml of 5% 
SDS/0.5M sodium phosphate (pH 7.2)/1 mM EDTA with a high- 
density spotted membrane with DNA from the interaction library 
affixed to it as prepared above. The membranes were washed 
once in 40mM sodium phosphate/0.1% SDS for 20 min at room 
temperature and once for 20 min at 65 °C before wrapping each 
membrane in Saran wrap and exposing it overnight to a 
phosphor-storage screen (Molecular Dynamics, USA). A digital 
image of each hybridised membrane was obtained by scanning the 
phosphor-storage screen using a phosphor-imager (Molecular 
Dynamics, USA). The digital image was stored on computer and 
was analyzed using the image analysis system for the analysis 
of DNA arrays as described in Lehrach et al., 1997 which 
marked positive hybridisation signals with square blocks. 
Figure 13 shows a magnified region of each hybridised membrane 
corresponding to that shown in Figure 11a containing the 
clones 06L22 and 08N24, the spotting positions of which are 
circled. These clones were predicted to express either the 
interacting fusion protein-pairs LexA-SIM1 & GAL4ad-ARNT, LexA- 
HD1.6 & GAL4ad-HIP1 or LexA-HD3.6 & GAL4ad-HIP1, and 
hybridisation with the specific SIM1 and ARNT probes has 
shown that both clones contain the plasmid constructs 
pBTM117c-SIM1 and pGAD427-ARNT. 

5.2 Identification of the individual members of the 
interaction by nucleic acid amplification and sequencing 

The individual clone 06L22 was recovered from the frozen 
plates of the original interaction library and inoculated into 
SD-leu-trp-his liquid medium. This culture was allowed to grow 
for 3 days at 30 °C and the corresponding plasmids contained 
in the clone were isolated using a QiaPrep (Qiagen, Hilden) 
procedure. Duplex PCR was used to simultaneously amplify the 
inserts contained within the plasmid constructs using primer- 
pairs specific for either the pBTM117 or pGAD427 plasmids. The 
presence of the SIM1 and ARNT inserts was confirmed for clone 
06L22 by electrophoresis of the amplified PCR products against 
separate control amplifications of the inserts from plasmids 
pBTM117c-SIM1 and pGAD427-ARNT as size markers (Figure 14). 

PCR of the individual inserts from individual plasmids carried 
by clone 06L22 was conducted as above except by using only the 
respective primer pair for the required plasmid. The 
individual inserts were also amplified directly from the yeast 
culture using a Whole Cell Yeast PCR Kit (Bio 101, USA). The 
pairs of inserts isolated from clone 06L22 either by 
amplification from the extracted plasmid DNA or by direct PCR 
of the yeast clone were subjected to DNA sequencing by 
standard protocols. 

The 1.26 kb inserts amplified using the primers specific to 
plasmid pBTM117 were confirmed as the expected fragment of the 
SIM1 gene by comparison with the known sequence for this gene 
(Probst et al., 1997). Likewise, the 1.37 kb inserts amplified 
using the primers specific to the pGAD427 plasmid were 
confirmed as the expected fragment of the ARNT gene. 

Example 6: Detection and identification of interacting 
proteins using a large-scale and automated application of the 
improved 2-hybrid system 

A scheme utilizing the method of the invention within a large- 
scale and automated approach for the parallel detection of 
clones that express interacting fusion proteins and the 
identification of members comprising the interactions is shown 
in Figure 6. Yeast clones from an "interaction library" that 
express interacting proteins are identified on a large-scale 
by the use of visual inspection or digital image processing 
and analysis of high-density gridded membranes on which their 
β-galactosidase activity has been assayed after growth on 
various selective media. Automated methods as described in 
earlier examples are used to effect the production of the 
interaction library and high-density spotted membranes, and 
the analysis of digital images of the β-gal assay and 
hybridisation images. 

6.1 Generation of an interaction library for a higher 
Eukaryote 

A random-primed and size selected (1-1.5 kb) cDNA library of 
40-hour post fertilisation sea urchin embryos 
(Strongylocentrotus purpuratus) cloned into the NotI/SalI 
sites of pSport1 by standard procedures (Life Technologies, 
USA) was obtained as a gift from A. Poustka. 100 ng of this 
library, representing the estimated 6000 different transcripts 
expressed at this developmental stage (Davidson, 1986), was 
transformed into electro-competent E. coli cells by standard 
electroporation techniques. Recombinant clones were selected 
by plating the transformation mixture on 2xYT/100 µg/ml 
ampicillin contained in 24 x 24 cm agar-trays (Genetix, UK). 
After growth for 18 hours at 37 °C, the resulting recombinant 
colonies (estimated to be 20,000 per tray) were washed from 
the 5 trays using 50 ml of LB liquid media for each tray. The 
amplified cDNA library cloned into pSport1 was isolated from 
this wash mixture by a QiaPrep (Qiagen, Germany) plasmid 
extraction procedure. Approximately 1 µg of the library 
inserts was then isolated from the plasmid DNA by NotI/SalI 
digestion and size selected (1-1.5 kb) by agarose gel 
purification using standard procedures. 

Two pools representing all three reading frames of the two 
vector series pGAD428 and pBTM118 were prepared by NotI/SalI 
digestion and pooling of 1 µg each of vectors pGAD428 a, b & c 
and pBTM118 a, b & c respectively. The insert mixture that was 
isolated as above was split into two equal fractions and 300 
ng was ligated with 50 ng of each prepared vector-series pool. 
Following ligation, each reaction was then separately 
transformed into electro-competent E. coli cells, and 
recombinant clones for each library were selected on five 24 x 
24 cm plates using kanamycin or ampicillin for the pGAD428 or 
pBTM118 libraries respectively. Approximately 500 µg of the 
pBTM118 and 500 µg of the pGAD428 libraries was extracted from 
the two sets of E. coli transformants by washing off the plated 
cells and a subsequent QiaPrep plasmid extraction of the wash 
mixture as described above. 

To generate the interaction library, molar-equivalent amounts 
of the DNA binding and activation domain libraries were 
pooled, and 20 µg of this mixture was co-transformed into the 
yeast strain L40cc by the method of Gietz et al. (1992). The 
resulting transformation mix was plated on a single 24 x 24 cm 
agar tray. The agar-trays were prepared as described in 
section 1.3.1. A total of twenty transformations were prepared 
and plated onto separate agar trays yielding an average of 
1500 yeast colonies per tray after 7 days of incubation at 
30°C. 

6.2 Creation of a regular grid-pattern of an interaction 
library in microtiter plates 

To create a regular grid-pattern of the interaction library, 
the agar-trays containing yeast colonies were placed in the 
modified laboratory picking robot and individual clones were 
automatically picked as described in section 3.1. A total of 
30 384-well microtiter plates were generated and represented 
an interaction library of greater than 10,000 clones for the 
study organism. After growth of yeast clones in the wells of 
the microtiter plate, the library was replicated to generate 3 
further copies, labelled and all copies were stored at -70°C 
to provide for analysis at a later date as described in 
section 3.1. 

6.3 Creation of a regular grid-pattern of an interaction 
library on planar carriers 

To provide for efficient analysis of the interaction library, 
the clones comprising it were arrayed at high density on 222 x 
222 mm porous membranes (Hybond N+, Amersham, UK) using the 
method described in section 3.3. A total of twenty replica
membranes were produced, each arrayed in a "3 x 3 duplicate"
regular grid-pattern of clones using 23 384-well microtiter
plates from a thawed copy of the stored interaction library.
On each replica membrane, one microtiter plate containing 8
different control clones representing known positive, negative
and false positive clones was additionally arrayed in position
24. This pattern corresponded to over 9000 yeast two-hybrid
clones spotted at a density of approximately 40 clones cm⁻².
To ensure the number of yeast cells on each spot was
sufficient for the four membranes which were to be placed on
the counterselection media plates, the robot was programmed to
spot onto each spot position 5 times from a slightly different
position within the wells of the microtiter plates. The robot
created a data-file in which the spotting pattern produced and
the barcode that had been automatically read from each 
microtiter plate was recorded. 

Each membrane was carefully laid onto approximately 300 ml of 
solid agar media in 24 x 24 cm agar-trays. Fourteen membranes 
were transferred to SD-leu-trp-his media and three each of the 
membranes which had been spotted five times were transferred 
to either SD-trp+CHX or SD-leu+CAN media. The yeast colonies 
were allowed to grow on the surface of the membrane by
incubation at 30 °C for 3 days. 

6.4 Detection of the readout system in a regular grid pattern 
and analysis using digital image analysis to identify positive 
clones 

To provide for the efficient identification of individual 
clones that expressed interacting fusion proteins, the 
activation state of the individual clones grown on the porous 
carriers was examined in a highly parallel manner. The replica 
arrays of the interaction library grown on the six membranes 
placed on the counterselective media, plus three further 
membranes which were placed on SD-leu-trp-his medium as 
described above, were assayed for lacZ activity, a digital 
image of each was captured and image-processed as described in 
section 1.4.1. Figure 15 shows a grey-scale image of readout
system activation for individual clones from the interaction 
library arrayed in a regular grid-pattern on a membrane filter 
and grown on SD-leu-trp-his medium. 

The activation state of the readout system for each individual 
clone in the regular grid-pattern grown on the three selective 
media was recorded from each digital image using the image 
analysis system described in section 4.1. These data were
collected for the interaction library grown on three replica-
membranes for each of the selective media SD-leu-trp-his,
SD-leu+CAN and SD-trp+CHX, and were related together for each
individual clone using the computer program shown in Figure
12a.

This program was used to query these data and identify those 
clones that had activated the readout system when grown on two 
out of three SD-leu-trp-his replica membranes, but not when
grown on any of the two sets of three replica membranes placed
on the two counterselective media SD-leu+CAN or SD-trp+CHX.
The data-base correctly identified the eight different control
clones each arrayed in 48 wells of the 24th microtiter plate.
A total of 7539 clones from the interaction library arrayed in 
23 384-well microtiter plates were thus identified as positive 
clones - clones that only activated the readout system when 
both plasmids (and hence fusion proteins) were expressed in 
the cell. 3983 clones were identified as false-positive clones
as they also activated the readout system when grown on SD-
trp+CHX medium - the growth medium that eliminated the plasmid
expressing the activation domain fusion protein. 113 clones
were identified as false positive clones by activating the
readout system when grown on SD-leu+CAN medium - the growth
medium that eliminated the plasmid expressing the DNA binding
fusion protein. These data were automatically made available
to a table of the relational database holding information on 
each clone of the interaction library as described in Example 
7. 
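The selection rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the program of Figure 12a: the function name, the class labels and the reading of "two out of three" as "at least two" are assumptions. The final check also assumes that the figure of 7539 counts every clone that activated the readout system on SD-leu-trp-his and that the two false-positive sets do not overlap; under those assumptions the remainder equals the 3443 positive clones quoted in section 6.7.

```python
def classify_clone(selective, chx, can):
    """Classify one clone from its readout calls on the replica membranes.

    selective: three booleans for the SD-leu-trp-his replicas
    chx:       three booleans for the SD-trp+CHX replicas
    can:       three booleans for the SD-leu+CAN replicas
    """
    if sum(selective) < 2:
        # Readout not activated on at least two of three replicas (assumed rule)
        return "negative"
    if any(chx):
        # Still active after loss of the activation domain plasmid,
        # so the DNA binding domain fusion activates on its own
        return "false_positive_BD"
    if any(can):
        # Still active after loss of the DNA binding domain plasmid,
        # so the activation domain fusion activates on its own
        return "false_positive_AD"
    return "positive"


# Counts reported in the text
activated_on_selective = 7539
false_positive_chx = 3983
false_positive_can = 113
remaining_positives = activated_on_selective - false_positive_chx - false_positive_can
```

A clone that is active on both counterselective media is reported here as a false positive for the DNA binding domain fusion only; a fuller scheme might record both.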

This relatively high number of false-positive clones 
identified following SD-trp+CHX selection can be explained 
since on elimination of the activation domain plasmid, the 
DNA-binding domain fusion protein is tested for its ability to 
activate the readout system without any partner protein. It is 
known that many transcripts expressed in early Sea Urchin 
embryos are transcription factors, and that fragments of 
transcription factors can commonly cause false positives in 
the yeast two-hybrid system when expressed as the DNA-binding 
domain fusion protein. Therefore, these results demonstrate
that the above method can efficiently eliminate large numbers
of false positive clones from a large-scale library vs.
library interaction screen.

6.5 Identification of the individual members of the 
interaction by nucleic acid amplification and sequencing 

A total of 96 positive clones were randomly selected from the 
database and recovered from a frozen copy of the interaction 
library clones stored in 384-well microtiter plates. The DNA 
sequences cloned into the pGAD428 and pBTM118 vectors carried 
by each clone were directly amplified as described in section 
5.2 except that the direct PCR reactions were conducted in 96-
well microtiter plates using a high-throughput water-bath
thermocycling machine (Maier et al., 1994).

Standard sequencing approaches were employed to characterise 
the nucleic acids encoding the DNA-binding domain fusion 
proteins of the positive clones following pBTM118-specific
96-well PCR as described above. Similarly, the sequence of the
insert encoding for the activation-domain fusion protein
following pGAD428-specific PCR was determined. Sequence
comparison of these inserts against published DNA sequences
using standard sequence comparison tools (e.g. BLAST),
identified that one interaction involved two previously 
unidentified gene fragments that were expressed by the 
positive-clone located in plate 5, well K20. From the 
predicted protein sequence these two genes were designated 
Protein A and Protein B. 

6.6 Identification of individual members of the interaction 
by nucleic acid hybridisation 

Regular grid patterns of the nucleic acids encoding the fusion 
proteins from the interaction library were constructed. The 
membranes which had been placed on the SD-leu-trp-his medium
and had not been used to assay β-gal activity were processed
according to the procedure described in Larin & Lehrach (1990)
in order to affix the DNA contained within the clones of the
interaction library onto the surface of the membrane. The DNA
fragment that encoded Protein A, isolated as above, was
radioactively labelled by the method of Feinberg & Vogelstein
(1983). This labelled probe was hybridised to an array with
DNA from the interaction library affixed to it, and the array
was washed and detected as described in section 5.1.

The number and identity of hybridisation-positive clones was 
determined for each hybridisation using the automated image 
analysis system described in Lehrach et al. (1997). Seven
clones from the interaction library were identified as 
hybridisation-positive for the probe encoding Protein A. 
Figure 16 shows a digital image of a DNA array hybridised with 
the gene fragment encoding Protein A with the hybridisation- 
positive clones identified and marked by the automated image 
analysis system, and Figure 17 shows a graphical
representation of the positives found by this analysis. The
database described in Example 7 was used to refer to the list 
of clones generated by the image analysis program and identify 
those hybridisation-positive clones that were interaction- 
positive clones and hence eliminate any false positive clones 
from further analysis. As expected, a hybridisation-positive 
clone was the clone 5K20 from which the probe corresponding to
Protein A was obtained. 

To extend the interaction pathway from Protein A, a second 
filter was hybridised with a radioactively labelled probe
generated from the fragment coding for Protein B. Analysis of 
the hybridisation signals with the database described in 
Example 7 resulted in the identification of eight interaction- 
positive clones that carried the gene fragment encoding for 
Protein B. Figure 18 shows a graphical representation of the 
hybridisation-positive and interaction-positive clones 
identified with probe B (open circles) and probe A (red 
circles). Two clones (5K20 and 3L11, marked by "A/B") gave a
hybridisation signal with both probe A and probe B, indicating
that both these positive clones expressed the same interacting
fusion proteins.

To further extend the interaction pathways of proteins A and 
B, the DNA binding and activation domain plasmids were 
extracted from one interaction-positive clone that gave a 
hybridisation signal only with probe B (clone 6D18). DNA
sequencing of the inserts carried by these genetic elements 
confirmed the presence of a gene fragment encoding for Protein 
B in the DNA binding domain plasmid. Sequence analysis showed 
that the activation domain plasmid carried a fragment for 
another unknown gene coding for Protein C. This gene fragment 
was used as a probe to another array and the data analysed as 
above. Figure 19 shows the results of this hybridisation 
(marked with diamonds), together with that from the previous
two hybridisations. A total of six interaction-positive clones 
were identified as carrying genetic elements encoding for 
Protein C. Three of these interaction-positive clones were 
previously shown to hybridise with probe B (4G19; 1D7; 6D18) 
and two clones to hybridise with probe A (1C22; 3A11). A
graphical view of the interactions identified by these three
simple hybridisations is outlined in Figure 19. Question marks
represent possible further steps in the network which could be 
further investigated by a similar investigation of the genetic 
elements carried by the remaining hybridisation-positive clones
for probes A, B or C. Indeed, by following this focused 
hybridisation approach, 14 different protein-protein 
interactions were identified by a total of nine hybridisations 
and subsequent sequencing of the inserts encoding the 
interacting members. All these data were entered into the data-
base described in Example 7.
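The way overlapping hybridisation results point to candidate interactions can be sketched as a set intersection. The clone names below are only those named in the text for probes A, B and C; each probe also hit further clones that the text does not name, so the sets are incomplete. The function name and data layout are assumptions made for illustration.

```python
from itertools import combinations

# Interaction-positive clones named in the text as hybridising with each probe.
# Incomplete: the text reports seven, eight and six clones for A, B and C.
hits = {
    "A": {"5K20", "3L11", "1C22", "3A11"},
    "B": {"5K20", "3L11", "4G19", "1D7", "6D18"},
    "C": {"4G19", "1D7", "6D18", "1C22", "3A11"},
}


def shared_clones(hits):
    """Return {(probe1, probe2): clones that hybridised with both probes}."""
    shared = {}
    for first, second in combinations(sorted(hits), 2):
        common = hits[first] & hits[second]
        if common:
            shared[(first, second)] = common
    return shared


links = shared_clones(hits)
```

A clone appearing under two probes carries both fragments, which is what suggests that the two encoded proteins interact in that clone.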

6.7 Automated rearraying of positive clones 

The 3443 positive clones identified as described above were 
distributed across all 23 microtiter plates of the interaction 
library. To greatly facilitate further analysis of positive
clones, it was advantageous to individually physically isolate
clones and to generate a second, re-arrayed regular grid- 
pattern of positive clones, preferably within a further set of 
384-well plates. 

Existing rearraying robots such as those described by Stanton
et al. (1996), Lehrach et al. (1997) or those sold by commercial
sources (Genetix, UK) failed to provide a satisfactory
inoculate when transferring yeast cells from individual wells
of a source ("mother") 384-well plate containing the original
interaction library into wells of a new, sterile 384-well
destination ("daughter") plate containing growth medium.
Therefore, the existing transfer pins were replaced by 
straight 2 mm diameter pins that terminated in a flat end. 
Secondly, the inoculation procedure was modified to maximise 
the amount of dried cell material carried on the pin that was 
transferred into the new well within the daughter plate as 
described for automated picking of yeast colonies in section 
3.1. The pins were sterilised between rearraying cycles by a 
0.3% hydrogen peroxide wash-bath, 70% ethanol wash-bath and 
heat-drying procedure as described in section 3.1. 

The list of positive clones, together with their plate-well 
location was generated from the data-base described in Example 
7 and automatically loaded as a computer file onto the 
rearraying robot. The robot automatically took the mother 
plate containing the first positive yeast two-hybrid clone by 
reference to the data file and read and recorded the barcode 
of the plate. Individual and sequential pins of the 96-pin
rearraying head were positioned above and lowered into the 
required wells from this first plate, and the mother plate was 
automatically exchanged when all positive clones had been 
sampled. When all 96-pins had been used to collect inoculates 
of positive clones, the head was automatically moved over to 
the first 384-well daughter plate containing SD-leu-
trp/7% glycerol and inoculated all 96 pins in the first set of
wells as described above. A data output file was then updated
which related the new plate-well location of a given positive
clone in the re-arrayed library to its old plate-well location
in the original interaction library. All pins were then
sterilised as described, and the cycle repeated until all
positive clones had been transferred from the interaction
library to a new plate-well location comprising the re-arrayed 
library. The data output file was then transferred to the 
central computer database to append a table in the data-base 
described in Example 7 to record the correct location of a 
given positive clone in the re-arrayed interaction library. 
The resulting clones in the daughter plates were replicated 
into two further copies and stored at -70°C as described in
section 3.1. 
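The bookkeeping behind the data output file, which relates each old plate-well location to a new one, might look like the following sketch. It is an assumption-laden illustration: the function name is invented, and wells of the daughter plates are filled in simple row-by-row order, whereas the real order would depend on how the 96-pin head addresses a 384-well plate.

```python
import string

ROWS = string.ascii_uppercase[:16]      # rows A to P of a 384-well plate
WELLS = [f"{row}{col}" for row in ROWS for col in range(1, 25)]
WELL_ORDER = {well: i for i, well in enumerate(WELLS)}


def rearray(positives):
    """Map (mother_plate, well) -> (daughter_plate, well).

    positives: iterable of (plate_number, well) tuples. Clones are visited
    plate by plate so that each mother plate needs to be loaded only once.
    """
    ordered = sorted(positives, key=lambda pw: (pw[0], WELL_ORDER[pw[1]]))
    mapping = {}
    for i, source in enumerate(ordered):
        daughter_plate = i // len(WELLS) + 1
        mapping[source] = (daughter_plate, WELLS[i % len(WELLS)])
    return mapping


# Three clone locations taken from section 6.6, used here only as sample input
example = rearray([(5, "K20"), (3, "L11"), (6, "D18")])
```

With sequential filling, 3443 positive clones would occupy nine 384-well daughter plates, the last one only partly filled.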

Example 7: Generation of a data-base of interactions. 

Central to the scheme (Figure 2) is a data-table holding
relevant information on each member of an interaction - the
cDNA_Table - where a separate record in the table represents
each member of an interaction, and members are indicated to
form interactions by sharing the same clone name. It is
advantageous to structure the core data-table in this way for
several reasons. First, the same core table can be used to 
hold data on cDNAs from different kinds of genetic libraries 
(for example, standard cDNA or genomic libraries) which can be 
generated during a global analysis using various genomic 
techniques, not just interaction data. Secondly, each of the 
members of an interaction, or genetic fragments may be further 
characterised by a number of ways for different sets of data. 
Of direct relevance to protein-protein interaction for a given 
genetic fragment in the cDNA_Table is first, the Gene_Table,
which provides a direct relationship to the fragment's DNA
sequence, nucleotide homology match (for example through BLAST
searching) and the corresponding gene name. Second, the
Domain_Table provides facility to directly access data of the
fragment's in-frame translation, amino acid homology match
(for example through BLASTP searching) and any 2 or 3-
dimensional structural information which may be known or can
be predicted. As is commonly known in molecular biology, there 
are many ways in which a given genetic fragment may be 
characterised, and this data-base structure provides the 
facility to relate from the central cDNA_Table to any other
table holding data describing said characterisation as may be 
appropriate. For example, those holding data on genetic, 
expression, target validation, protein biochemistry or library 
construction information. Of particular relevance to the
method of invention is the relationship of a given cDNA
fragment to a table holding information on oligofingerprinting
data. Said oligofingerprinting data can be used to identify
each member of an interaction in a highly parallel manner and
includes fields for data such as cluster number, confidence of
cluster membership and predicted gene homology for that
cluster (Maier et al., 1994). Third, such a data-base
structure will more easily enable tertiary or higher order 
interactions to be incorporated within the same data table. 
This is in contrast to a structure in which interactions 
rather than members of an interaction were the basic object or 
record in a data table, and for each higher order interaction 
a new data-table would be needed or an existing data-table 
modified.
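The "one record per member, linked by clone name" structure can be illustrated with a small in-memory SQLite table. This is a sketch under stated assumptions, not the schema of Figure 2: the column names are hypothetical, the 6D18 records follow the pairing reported in section 6.6, and the clone "T1" with fragments P, Q and R is invented to show that a higher-order interaction needs no new table.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE cDNA_Table (
        cdna_id  INTEGER PRIMARY KEY,
        clone    TEXT NOT NULL,  -- clone name shared by interacting members
        fusion   TEXT NOT NULL,  -- 'BD' or 'AD' (hypothetical field)
        fragment TEXT NOT NULL   -- hypothetical label for the cDNA fragment
    )
""")
con.executemany(
    "INSERT INTO cDNA_Table (clone, fusion, fragment) VALUES (?, ?, ?)",
    [
        ("6D18", "BD", "Protein B"),  # pairing reported in section 6.6
        ("6D18", "AD", "Protein C"),
        ("T1", "BD", "P"),            # invented three-member complex
        ("T1", "AD", "Q"),
        ("T1", "AD", "R"),
    ],
)


def interacting_pairs(con):
    """List every pair of members that share a clone name."""
    return con.execute("""
        SELECT a.clone, a.fragment, b.fragment
        FROM cDNA_Table AS a
        JOIN cDNA_Table AS b
          ON a.clone = b.clone AND a.cdna_id < b.cdna_id
        ORDER BY a.clone, a.cdna_id, b.cdna_id
    """).fetchall()


pairs = interacting_pairs(con)
```

The three-member clone simply yields three pairwise rows from the same table, which is the advantage the text claims over a table keyed on interactions.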

In the case of a yeast two-hybrid interaction screen one 
related table would be the Y2H_Table. Said table may include
information for a given clone pertaining to cloning and 
experimental details of its creation, the tissue and library 
from which it was derived, its physical location to enable 
easy access for further studies, whether it was derived from 
the mating of given Mata and Matα strains. Importantly, the
Y2H_Table holds information pertaining to the interaction
class of the clone - where said interaction class is defined
as whether the clone was a positive clone, negative clone, or
a false positive with respect to either the activation domain
(AD) or binding domain (BD) fusion protein. The value for said
interaction class is easily derived for a large number of
clones by the method of invention described in earlier
examples.

To assist any focused approach to identifying members 
comprising the interactions, the Hyb_Table is provided. This
table relates for a given clone, the hybridisation intensity 
obtained with a given probe in a hybridisation experiment 
using a given high-density array. Said high-density array to
be related to tables holding data from the spotting robot such 
as the defined spotting pattern used, the method by which the 
array was produced and the identity of the library and clones 
arrayed on said array. The incorporation of these tables 
within a user interface will enable this embodiment of the 
method of invention to be easily conducted by displaying to 
the user the physical location of a given positive yeast two-
hybrid clone that hybridised to a given probe. Said two-hybrid
clone can then be recovered, the members comprising the 
interaction isolated by PCR and sequenced. Said sequenced 
members of an interaction then provide data to be entered into 
the cDNA_Table and other related tables on further analysis.
Said member to then be used as a second hybridisation probe 
onto an array to identify the next step in an interacting 
pathway by the same procedure. 

On collection of a substantial number of interacting members 
within the cDNA_Table, these data can be curated by manual 
and/or expert systems to update a definitive data table for 
example the PathCode_Table. Said definitive database to hold
the highest quality information on interactions from the
cDNA_Table, where said highest quality information on
interactions to be those from the cDNA_Table that pass a level
of "certainty" as specified to the curator and/or expert
system. To assist in the decision-making process, all relevant 
data especially that of the translated frame of the cDNA and 
corresponding protein domain is related from other tables and 
presented in a usable form to the curator and/or expert 
system. This presentation allows for easy recognition and
exclusion or correction of basic errors in the data such as
poor quality sequencing, or incorrectly cloned cDNA fragments. 
These may include contaminating fragments which can be 
identified as originating from an organism which is different 
to that of the cDNA library. 

A given cDNA is entered into the PathCode_Table only once for
each interaction in which it is found, together with a record 
for the corresponding interacting cDNA (or cDNAs for multimer 
complexes). However, where a cDNA has different interactions,
for example with different proteins or where different protein
domains of the cDNA interact with different proteins, then in
each case a different record for the cDNA is created. These
different records are linked by a common and unique
"Interaction ID". A given interaction is represented thus only
once in the PathCode_Table, and is related to previous tables
in the data-base by the host-cell clone that represents the
interaction and the ID of each cDNA in the interaction. Said 
host-cell that represents the interaction is selected by 
consideration and curation of all host-cells and the 
interacting fragments representing said interaction held in 
the cDNA_Table.

A set of criteria can be implemented to assist in said 
curation and selection, and to derive a measure of confidence 
for the interaction. By way of example, such criteria may have
decreasing information value and include: First, if a given
interaction is observed in both directions of the experiment,
i.e. proteinA-AD interacting with proteinB-BD, and proteinB-AD
interacting with proteinA-BD. Second, if different examples of
the same interaction are observed. Where different examples of 
the same interaction are defined as protein fragments of 
substantially different length and position (for example 
greater than 10% different) but from the same underlying 
protein domain and are also found to interact. Third, if the 
same examples of the same interaction are observed, for 
example by multiple cloning of the same fragments where the
same fragments are of substantially the same length and
position from the same underlying protein domain. Fourth, that 
the protein domains that interact may have biological 
relevance. That is, similar domains or genes are known to 
interact from public literature, or it is known that both 
genes are expressed or likely to be expressed in the same 
cellular location. This criterion can also be used as an 
internal quality control of the library cloning, interaction 
experiment and subsequent identification of interacting 
members since every interaction experiment should identify a 
certain set of published "house-keeping" interactions, and the
identification of such interactions can be used as quality 
measure for the overall interaction experiment. 
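One way to turn the four criteria above into a single confidence measure is a weighted sum in which earlier criteria count for more. The weights, field names and function name below are hypothetical; the text states only that the criteria have decreasing information value, not how they should be combined.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    reciprocal: bool = False              # seen in both AD/BD orientations
    different_fragments: bool = False     # distinct fragments of one domain interact
    repeated_fragment: bool = False       # same fragment pair cloned more than once
    biologically_plausible: bool = False  # supported by literature or co-localisation


# Hypothetical weights: each criterion outweighs all later ones combined
WEIGHTS = {
    "reciprocal": 8,
    "different_fragments": 4,
    "repeated_fragment": 2,
    "biologically_plausible": 1,
}


def confidence(evidence):
    """Sum the weights of every criterion that the interaction satisfies."""
    return sum(weight for name, weight in WEIGHTS.items() if getattr(evidence, name))
```

A curator or expert system could then compare the score against whatever level of "certainty" has been specified for entry into the PathCode_Table.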

One criterion of particular importance is the optional
validation of a given interaction by secondary experiments. 
For example, cDNA fragments representing the interacting 
proteins may be subcloned, and additional interaction 
experiments be conducted. Said additional interaction 
experiments may include testing each protein for interaction 
against a set of unrelated proteins to investigate the 
specificity of said interaction. Said testing may be conducted 
using the same interaction method that identified the 
interaction, for example the yeast two-hybrid, but preferably
it is an independent method. Favoured is where a given
interaction is biochemically validated using methods including 
tissue co-northern, cellular co-localisation or co- 
precipitation studies. 

All these criteria are considered by the curator and/or expert 
system to assist in the decision on which cDNA fragments and 
their interactions are entered into the PathCode_Table. Other
interactions known or published in scientific literature may 
also be entered into this data-base during the curation 
procedure, and hence a field in the table represents the 
source of this interaction being internal or an external 
reference. The PathCode_Table has relational links to
secondary or external data-bases holding data on nucleotide
and protein sequences, and biochemical, structural, biological 
or bibliographical information. These data, representing the 
complete relationships between all tables and data-bases can 
be queried by using simple user interfaces, designed for 
example using Java, or by more complicated commands such as 
those provided by SQL. Possible queries include those to 
locate from these data interactions, pathways or networks for 
a given nucleotide or amino acid sequence or motif, or for a
given 3-dimensional structure or motif. Secondly, for highly
established networks, these data may be queried to identify a 
given pathway between two given points. It may be that some 
queries are more efficiently conducted using a substantially 
different design of the PathCode_Table - for example by
representing a given interaction as the underlying record 
rather than a given member of an interaction. A person skilled 
in the art would be able to transfer data from one table 
design to another using standard data-parsing systems to 
enable said more efficient conduction of queries. 
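A query for a pathway between two given points is naturally expressed once interactions, rather than members, are the underlying record. The sketch below uses a breadth-first search, a standard graph algorithm that the text does not itself name; the function name and the example network (including the protein "D") are invented for illustration.

```python
from collections import deque


def find_pathway(interactions, start, goal):
    """Return a shortest chain of interacting proteins from start to goal.

    interactions: iterable of (protein, protein) pairs, treated as undirected.
    Returns a list of protein names, or None if no chain exists.
    """
    graph = {}
    for a, b in interactions:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(graph[path[-1]]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Invented example network; "D" is not a protein from the text
network = [("A", "B"), ("B", "C"), ("C", "D")]
route = find_pathway(network, "A", "D")
```

Because the search expands one interaction step at a time, the first chain it finds is also one with the fewest intermediate proteins.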

The result of these queries is displayed using graphical 
methods to enable the investigator to interpret these data 
most efficiently. Said graphical methods to include elements 
activated by mouse clicks such as hotlinks to seamlessly link 
these data with other data sources, or to query and display 
further levels of interactions. Computer-based methods of 
generating visual representations of specific interactions, 
partial or complete protein-protein interaction networks can 
be employed to automatically calculate and display the 
required interactions most efficiently. Both finding the 
network paths and calculating the optimal display of the found 
paths can be based on algorithms well known in the art of 
mathematical graph theory. For example, algorithms similar to 
those which have been employed to display other biological 
relationships such as genetic pedigrees and phylogenetic 
relationships.

An established computer data-base of protein interactions has
many useful applications. For example, it may be used to 
predict the existence of new biological interactions or 
pathways, or to determine links between biological networks. 
Furthermore with this method, the function and localisation of 
previously unknown proteins can be predicted by determining 
their interaction partners. It also can be used to predict the 
response of a cell to changes in the expression of particular 
members of the networks without making a molecular, cellular 
or animal experiment. Finally, these data can be used to 
identify proteins or interactions between proteins within a 
medically relevant pathway, which are suitable for therapeutic 
intervention, diagnosis or the treatment of a disease. 

Example 8: Preselection against false positive clones and
the automated creation of a regular grid-pattern of yeast 
cells expressing a fusion protein 

8.1 Genetic pre-selection of false positive clones 

Three mating type-a yeast strains were constructed by co-
transformation using the method of Schiestel & Gietz (1989)
into L40ccu, of the plasmid pLUA containing the URA3 readout
system, and either the pBTM117c, pBTM117c-SIM1 or pBTM117c-
HIP1 plasmids respectively. Transformants that contained both
the pLUA plasmid and one of the DNA binding domain plasmids
were selected on SD-trp-ade medium. Three mating type-α yeast
strains were similarly constructed by cotransformation into
L40ccuα of pLUA, and either the pGAD427, pGAD427-ARNT or
pGAD427-LexA plasmids respectively. Transformants that
contained both the pLUA and one of the activation domain
plasmids were selected on SD-leu-ade medium. The yeast strains
thus obtained are listed in Table 3. 

The yeast strains x1a, x2a and x3a were replica plated onto
the selective media SD-trp-ade, SD-trp-ade containing 0.2% 5-
FOA and SD-trp-ade-ura, while the yeast strains y1α, y2α and
y3α were replica plated onto the selective media SD-leu-ade,
SD-leu-ade containing 0.2% 5-FOA and SD-leu-ade-ura. Table 4
shows that the two yeast strains x3a and y3α which expressed
the fusion proteins LexA-HIP1 and GAL4ad-LexA respectively
were unable to grow on their respective media containing 5-FOA 
yet were able to grow on their respective media lacking 
uracil. In contrast, all other yeast strains that contained 
plasmids that expressed fusion proteins that were alone unable 
to activate the readout system could grow on their respective 
media containing 5-FOA, but could not grow on selective media 
lacking uracil. This indicates that it is possible to 
eliminate yeast clones that express single fusion proteins 
which auto-activate the readout system, by selection on media
containing 5-FOA. Thus, the URA3 readout system successfully 
eliminated clones containing auto-activating fusion proteins 
prior to interaction mating. 
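The growth pattern reported above follows from a simple rule: a fusion protein that auto-activates the readout system switches on URA3, which allows growth without uracil but makes 5-FOA toxic. The sketch below encodes that rule; the function name and medium labels are assumptions, and it models only the URA3 readout, not the other nutritional selections.

```python
def grows(auto_activates, medium):
    """Predict growth from URA3 readout activation alone.

    auto_activates: True if the single fusion protein activates the readout.
    medium: "5-FOA" (contains 5-FOA and uracil) or "-ura" (lacks uracil).
    """
    ura3_expressed = auto_activates
    if medium == "5-FOA":
        # URA3 activity converts 5-FOA into a toxic product
        return not ura3_expressed
    if medium == "-ura":
        # Without uracil, only cells expressing URA3 can grow
        return ura3_expressed
    raise ValueError(f"unknown medium: {medium}")


# The two auto-activating fusions named in the text, plus a generic control
fusions = {
    "LexA-HIP1": True,
    "GAL4ad-LexA": True,
    "non-activating fusion": False,
}
survivors_on_foa = sorted(name for name, auto in fusions.items() if grows(auto, "5-FOA"))
```

Only the non-activating fusion survives on 5-FOA, which matches the preselection outcome described for Table 4.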

8.2 Creation of a regular grid pattern of genetically pre- 
selected yeast cells expressing a fusion protein 

Two defined libraries of clones that express fusion proteins 
were created. First, the yeast strain L40ccu was transformed 
with the plasmid pLUA and a resulting stable transformant 
colony cultured in minimal medium lacking adenine. Cells from 
this culture were rendered competent and transformed with 3 µg
of a pooled mixture of all six pBTM117c constructs shown in Table
2. Second, the yeast strain L40ccuα was transformed with the
plasmid pLUA and a resulting stable transformant colony
cultured in minimal medium lacking adenine. Cells from this
culture were rendered competent and transformed with 3 µg
of a pooled mixture of all six pGAD427 constructs shown in Table 2.
In all cases, competent cells were prepared and 
transformations conducted using the method of Schiestel & 
Gietz (1989).

The two transformation mixes were incubated at 30 °C for 2 
hours in 10 ml of YPD liquid medium before plating onto large
24 x 24 cm agar trays (Genetix, UK). The Mata cells containing
the pBTM117c fusion library were plated onto minimal medium
lacking tryptophan and adenine but containing 0.2% 5-FOA (SD-
trp-ade+FOA), while the Matα cells containing the pGAD427
fusion library were plated onto minimal medium lacking leucine
and adenine but containing 0.2% 5-FOA (SD-leu-ade+FOA). The
agar trays were poured using an agar-autoclave and pump 
(Integra, Switzerland) to minimise tray-to-tray variation in 
agar colour and depth. After plating, the colonies were grown 
by incubating the trays at 30°C for 4 to 7 days resulting in 
approximately 1500 colonies per tray. 

Mata clones containing the plasmid pBTM117c-HIP1 and Matα
strains containing the plasmid pGAD427-LexA expressed the
fusion proteins LexA-HIP1 and GAL4ad-LexA respectively. These
fusion proteins were shown to activate the URA3 readout system 
without any interacting fusion protein. Therefore, cells 
carrying these plasmids should be unable to grow on selective 
media containing 5-FOA. Hence, only those yeast clones 
expressing a single fusion protein unable to activate the URA3 
reporter gene will form colonies and be picked by the modified
robotic system. 

Using the modified laboratory picking robot, individual yeast 
colonies were automatically picked from the agar-trays into
individual wells of sterile 384-well microtiter plates, as
described in section 1.3.1 except that the Mata yeast strains
were picked into microtiter plates containing the growth
medium SD-trp-ade and 7% (v/v) glycerol, while the Matα yeast
strains were picked into microtiter plates containing the
growth medium SD-leu-ade and 7% (v/v) glycerol. The resulting
microtiter plates were incubated at 30 °C for 4 days with a
cell-dispersal step after 36 hours as described in section 3.1. After
incubation, each plate was replicated to create two additional 
copies into labelled 384-well microtiter plates and pre-filled 
with the liquid growth medium containing 7% glycerol as was 
appropriate for the yeast strain. The replicated plates were 
incubated at 3 0 °C for 4 days with a cell dispersion step 
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conducted after 3 6 hours as above, subsequently frozen and 
stored at -70°C together with the original picked microtiter 
plates of the libraries of cells expressing fusion proteins. 

It will be clear that higher density regular grid-patterns of 
such an interaction library can be easily generated by a 
person skilled in the art from these microtiter plates of 
diploid yeast cells by following the methods disclosed in 
sections 3.2, 3.3 and 3.4 of this invention. 

8.3 Visual differentiation against false positives for an
improved yeast two-hybrid system

Six yeast strains were generated by transforming each of the
pBTM117c plasmid constructs described in Table 2 into L40ccu
by the method of Schiestel & Gietz (1989). Each strain was
plated on selective growth medium lacking tryptophan, buffered
to pH 7.0 with potassium phosphate and containing 2 µg/ml of
the β-galactosidase substrate X-Gal (SD-trp/XGAL). Six further
strains were similarly constructed by transforming each of the
pGAD427 plasmid constructs described in Table 2 into L40ccuα.
These strains were plated on selective growth medium lacking
leucine, buffered to pH 7.0 with potassium phosphate and
containing 2 µg/ml of X-Gal (SD-leu/XGAL). After incubation at
30 °C for 7 days, the strains were inspected for growth and
blue colour. Table 5 shows that although all yeast strains
were able to grow on the selective media, only the L40ccu
strain expressing the fusion protein LexA-HIP1 and the L40ccuα
strain expressing the fusion protein GAL4ad-LexA turned blue.
In contrast, all other yeast strains, which contained plasmids
expressing fusion proteins unable to activate the readout
system alone, grew on the selective media but did not turn
blue. It was found that for the fusion proteins described
here, the blue colour generated by auto-activation of the
β-galactosidase readout system developed faster than any pink
colour of other clones due to the ade2 mutation. However, the
blue colour may develop more slowly than the pink colour for
some fusion proteins, which may affect the reliability of
visual differentiation using automated systems with grey-scale
vision systems. In such cases, a person skilled in the art will
be able to incorporate colour recognition systems or colour
filters, or to construct a yeast strain that does not develop
the pink colour, for example by using a strain carrying the
wild-type ADE2 gene or the complementary mutation ade3.

8.4 Using automation to visually discriminate false-positive 
yeast clones and the creation of a regular grid pattern of 
cells 

Two defined fusion protein libraries were generated. First, the
six pBTM117c constructs shown in Table 2 were pooled and 3 µg of
the mixture was co-transformed into the yeast strain L40ccu.
The resulting transformants were selected by plating the
mixture onto five large 24 x 24 cm agar-trays (Genetix, UK)
containing minimal medium lacking tryptophan, buffered to pH
7.0 with potassium phosphate and containing 2 µg/ml of X-Gal
(SD-trp/XGAL). Second, the six pGAD427 constructs shown in
Table 5 were pooled and 3 µg of the mixture was co-transformed
into the yeast strain L40ccuα. The resulting transformants
were selected by plating the mixture onto five large 24 x 24
cm agar-trays (Genetix, UK) containing minimal medium lacking
leucine, buffered to pH 7.0 with potassium phosphate and
containing 2 µg/ml of X-Gal (SD-leu/XGAL). These agar-trays
were poured using an agar-autoclave and pump (Integra,
Switzerland) to minimise tray-to-tray variation in agar colour
and depth. The agar-trays were incubated for 7 days to allow
the yeast clones to grow and the blue colour of clones able to
activate the β-galactosidase reporter gene to develop. In all
cases, competent cells were prepared and transformations
conducted using the method of Schiestel & Gietz (1989).

Using the modified laboratory picking robot, individual yeast
colonies were automatically picked from the agar-trays into
individual wells of sterile 384-well microtiter plates, as
described in section 3.1 except that the Mata yeast strains
were picked into microtiter plates containing the growth
medium SD-trp and 7% (v/v) glycerol, while the Matα yeast
strains were picked into microtiter plates containing the
growth medium SD-leu and 7% (v/v) glycerol.

Automated visual differentiation was made by using the
blue-white sorting parameters described in section 3.1. The robot
was programmed to pick only white colonies into microtiter
plates and to ignore all colonies that had turned blue on
activation of the β-galactosidase reporter gene. Figure 20
displays automated visual discrimination of false positive
clones using the modified picking system described above. The
resulting microtiter plates were incubated at 30 °C for 4 days
with a cell-dispersal step after 36 hours (section 3.1). After
incubation, each plate was replicated to create two additional
copies in labelled 384-well microtiter plates pre-filled
with the liquid growth medium containing 7% glycerol as was
appropriate for the yeast strain. The replicated plates were
incubated at 30 °C for 4 days with a cell dispersion step
conducted after 36 hours as above, subsequently frozen and
stored at -70°C together with the original picked microtiter
plates of the libraries of cells expressing fusion proteins.

It will be clear that higher density regular grid-patterns of 
such an interaction library can be easily generated by a 
person skilled in the art from these microtiter plates of 
diploid yeast cells by following the methods disclosed in 
sections 3.2, 3.3 and 3.4 of this invention. 

Only those colonies that expressed the fusion protein
LexA-HIP1 or GAL4ad-LexA should be able to activate the LacZ
gene and hence turn blue when grown on the selective medium.
Therefore, blue colonies from the Mata library would be
expected to carry the pBTM117c-HIP1 construct while white
colonies would carry other pBTM117c plasmid constructs.
Likewise, blue colonies from the Matα library would be
expected to carry the pGAD427-LexA construct while white
colonies would carry other pGAD427 plasmid constructs. To
test this hypothesis, 10 white and 10 blue colonies were
randomly selected from a picked agar-tray of the Mata library,
and twenty colonies from a 384-well microtiter plate that had
been automatically picked from this tray. All 40 colonies
were hand inoculated into individual 1 ml liquid cultures of
SD-trp medium and the cultures grown for 3 days at 30°C. The
insert carried by each clone was checked by direct PCR
amplification of the pBTM117c insert from the yeast culture
and DNA sequencing by standard protocols. All ten yeast
colonies that had activated the readout system and turned blue
carried the 1.2 Kb HIP1 fragment, while the white colonies
carried the 1.6 Kb HD1.6 insert, the 1.1 Kb SIM insert or gave
no amplification product (non-recombinant vector). Of the
twenty clones selected from the 384-well microtiter plate
which had been automatically visually differentiated, none
carried the 1.2 Kb HIP1 fragment. A similar experiment with
clones manually selected and automatically picked from the
Matα library confirmed that blue colonies contained the LexA
insert from the pGAD427-LexA construct, and no automatically
picked colonies carried this insert. The LexA-HIP1 fusion
protein encoded by pBTM117c-HIP1 and the GAL4ad-LexA fusion
protein encoded by pGAD427-LexA were known to auto-activate
the readout system without any partner protein.
Hence, automatic visual differentiation has preselected
against these false positive clones and automatically created
a regular grid pattern of yeast clones expressing a single
fusion protein unable to activate the readout system.
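
The insert check described above amounts to matching each PCR
product size to a known insert and comparing the result with the
colony colour. The following Python sketch illustrates that
bookkeeping. The function names and the sizing tolerance are
assumptions made for the example. Only the three insert sizes
and the auto-activating behaviour of HIP1 come from the text.

```python
# Expected insert sizes (Kb) as given in the text; HIP1 is the only
# auto-activating insert among them.
EXPECTED_INSERT_KB = {"HIP1": 1.2, "HD1.6": 1.6, "SIM": 1.1}
AUTO_ACTIVATING = {"HIP1"}


def identify_insert(size_kb, tolerance_kb=0.04):
    """Match an observed PCR product size to a known insert.

    None stands for "no amplification product". tolerance_kb is an
    assumed gel-sizing tolerance, not a value from the source.
    """
    if size_kb is None:
        return "non-recombinant"
    for name, expected in EXPECTED_INSERT_KB.items():
        if abs(size_kb - expected) <= tolerance_kb:
            return name
    return "unknown"


def expected_colour(insert_name):
    """Blue for auto-activating inserts, white for everything else."""
    return "blue" if insert_name in AUTO_ACTIVATING else "white"


for size in (1.2, 1.6, 1.1, None):
    insert = identify_insert(size)
    print(size, insert, expected_colour(insert))
```

Because the HIP1 and SIM inserts differ by only 0.1 Kb, sizing
alone is a weak discriminator, which is consistent with the use
of DNA sequencing alongside PCR in the experiment above.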

Example 9: Automated interaction mating to combine genetic
elements in yeast cells

9.1 Automated interaction mating on a solid support in a
regular pattern

The yeast strains that did not express auto-activating fusion
proteins in section 8.1 were mated using an automated
approach. Each of the yeast strains x1a, x2a, y1α and y2α was
grown in every well of one of four microtiter plates
containing SD-trp-ade medium for the Mata strains and
SD-leu-ade medium for the Matα strains. Each plate was labelled
with a unique barcode and, using a spotting robot such as that
described by Lehrach et al. (1997), the yeast strains x1a and
x2a were transferred in a defined 2x2 duplicate pattern with an
inter-spot spacing of 2 mm to Hybond-N+ membrane (Amersham)
which had been pre-soaked with YPD medium. The spotting robot
then automatically transferred the yeast strains y1α and y2α
to the same respective spotting positions on each membrane,
which already contained the x1a and x2a clones. The robot
automatically sterilised the spotting tool, changed the
microtiter plate between each set of clones transferred and
created a data-file in which the spotting pattern produced and
the barcode that had been automatically read from each
microtiter plate were recorded. The spotted membranes were
transferred to YPD plates and incubated overnight at 30 °C
to allow mating and growth to occur. Each membrane was assayed
for β-Gal activity using the method of Breeden & Nasmyth
(1985) and was subsequently air dried overnight. A digital
image of each dried filter was captured using a standard A3
computer scanner and image processed as described in section
4.1. The processed image was stored on computer and the
identity of clones that expressed β-galactosidase was
determined using the image analysis system described in
section 4.1. Figure 21 shows the results of automated
interaction mating between the strains x1a & y1α and x2a &
y2α. Both resulting diploid strains grew on YPD media, yet
only the diploid strain resulting from the interaction mating
of x2a & y2α, which contained plasmids encoding the interacting
fusion proteins LexA-SIM1 & GAL4ad-ARNT respectively, showed a
LacZ+ phenotype and turned blue on incubation with X-Gal. No
β-galactosidase activity was observed for the diploid strain
resulting from the interaction mating between the strains x1a
and y1α, which contained plasmids encoding the proteins LexA and
GAL4ad.
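
The data-file kept by the spotting robot can be pictured as a
list of spot records that tie each membrane position to the two
strains spotted there and to the barcodes of their source
plates. The Python sketch below is a hypothetical illustration
of such a record. The diagonal placement of the duplicates, the
barcode strings and all names in the code are assumptions, since
the text states only that a defined 2x2 duplicate pattern with
2 mm spacing was used.

```python
from dataclasses import dataclass

SPACING_MM = 2.0  # inter-spot spacing stated in the text


@dataclass(frozen=True)
class Spot:
    x_mm: float
    y_mm: float
    mat_a_strain: str
    mat_alpha_strain: str
    plate_barcodes: tuple


def duplicate_block(pairs, barcodes, spacing_mm=SPACING_MM):
    """Return the four spots of one 2x2 block: two mating pairs,
    each spotted in duplicate (assumed diagonal layout)."""
    if len(pairs) != 2:
        raise ValueError("a 2x2 duplicate block holds exactly two pairs")
    # (column, row, index of the mating pair placed at that position)
    layout = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
    spots = []
    for col, row, i in layout:
        a_strain, alpha_strain = pairs[i]
        spots.append(Spot(col * spacing_mm, row * spacing_mm,
                          a_strain, alpha_strain,
                          (barcodes[a_strain], barcodes[alpha_strain])))
    return spots


# Hypothetical barcodes, one per source microtiter plate.
barcodes = {"x1a": "BC0001", "x2a": "BC0002",
            "y1alpha": "BC0003", "y2alpha": "BC0004"}
spots = duplicate_block([("x1a", "y1alpha"), ("x2a", "y2alpha")], barcodes)
for spot in spots:
    print(spot)
```

Keeping the strain pair and plate barcodes with every position
means a blue spot on the scanned filter can be traced back to
its source wells without consulting a separate layout table.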

9.2 Automated interaction mating based on liquid culture 

Two defined libraries of clones which express fusion proteins
were created. First, the yeast strain L40ccu was transformed
with the plasmid pLUA and a resulting stable transformant
colony cultured in minimal medium lacking adenine. Cells from
this culture were rendered competent and transformed with 3 µg
of a pooled mixture of all six pBTM117c constructs shown in
Table 2. Second, the yeast strain L40ccuα was transformed with
the plasmid pLUA and a resulting stable transformant colony
cultured in minimal medium lacking adenine. Cells from this
culture were rendered competent and transformed with 3 µg of a
pooled mixture of all six pGAD427 constructs shown in Table 2.
In all cases, competent cells were prepared and
transformations conducted using the method of Schiestel &
Gietz (1989).

The cells in the two resulting transformation mixes were
allowed to recover by incubation at 30°C in YPD liquid medium
for 2 hours before plating onto large 24 x 24 cm agar trays
(Genetix, UK). The Mata cells containing the pBTM117c fusion
library were plated onto minimal medium lacking tryptophan and
adenine but containing 0.2% 5-FOA (SD-trp-ade+FOA), while the
Matα cells containing the pGAD427 fusion library were plated
onto minimal medium lacking leucine and adenine but containing
0.2% 5-FOA (SD-leu-ade+FOA).

The colonies on the agar-trays were grown by incubation at
30°C for 4 to 7 days. To minimise false positives arising from
dormant cells, the colonies on the two agar-trays were
replica-plated onto new agar-trays containing the same
respective selective media as a given original tray using
standard velvet replication. This replication procedure only
transferred cells from the top of a growing colony and thus
reduced the carry over of dormant cells and hence the number
of false positive clones in the yeast two-hybrid system. These
replica agar-trays were incubated at 30°C for 4 to 7 days in
order for the yeast cells to grow.

To conduct the liquid interaction mating, the resulting Mata
and Matα colonies were separately collected off both replica
trays by washing with 20 ml of liquid minimal medium. These
two mixtures of yeast clones were carefully resuspended,
pelleted and washed with sterile distilled water before
incubation in 100 ml of YPD in order to ensure that the cells
in both mixtures were mating competent. The two populations of
mating competent cells were combined in 500 ml of YPD liquid
media contained within a 10 litre flat bottomed flask and
incubated at 30°C with very gentle shaking (< 60 rpm)
overnight to allow interaction mating to proceed. The
resulting mixture of diploid cells was pelleted by gentle
centrifugation at 3000 rpm for 5 min, washed twice with 50 ml
of sterile distilled water and finally, 10 ml of the resulting
cell suspension was plated onto each of five 24 x 24 cm
agar-trays containing 300 ml of minimal medium lacking leucine,
tryptophan, adenine, histidine and uracil
(SD-leu-trp-ade-his-ura). The agar trays were poured using an
agar-autoclave and pump (Integra, Switzerland) to minimise
tray-to-tray variation in agar colour and depth. After plating,
the colonies were grown by incubating the trays at 30°C for 4
to 7 days.

After incubation, the resulting diploid yeast cells expressing
interacting fusion proteins were automatically picked using
our modified picking system as described in section 3.1 except
that the picked clones were inoculated into microtiter plates
containing the liquid selective medium SD-leu-trp-ade/7%
glycerol. The interaction library comprising the diploid yeast
cells contained in the microtiter plates was grown by
incubation at 30 °C as described in section 3.1. Two further
copies of the interaction library were made into new
microtiter plates containing SD-leu-trp-ade/7% glycerol growth
medium; all plates were individually labelled with a unique
barcode and stored at -70 °C until required for further
analysis as described in section 3.1.

It will be clear that higher density regular grid-patterns of 
such an interaction library can be easily generated by a 
person skilled in the art from these microtiter plates of 
diploid yeast cells by following the methods disclosed in 
sections 3.2, 3.3 and 3.4 of this invention. The creation of 
high-density regular grid patterns of diploid yeast cells can 
be conducted using the procedures as described in earlier 
sections. These arrays can be used to assay reporter gene 
activity, or for generation of nucleic acid arrays for 
hybridisation. Modifications to selective medium may be 
required which a person skilled in the art will recognise. 

Example 10: Application of the improved two-hybrid system 
to a prokaryotic two-hybrid system 

10.1 Strains, readout systems and vectors 

Two E.coli strains KS1-OR2HF+ and KS1-OR2HF- were created that
carry the sacB counterselective marker under the control of the
placOR2-62 promoter, and also the tetracycline selective gene
under the control of a second placOR2-62 promoter. Both
strains have the sacB counterselective reporter gene stably
inserted within the E.coli chromosome by knock-out of the
arabinose operon to enable arabinose-controlled inducible
promoters to be utilised. The selective Tet reporter gene is
stably inserted within the chromosome by knock-out of the
lactose operon, which also enables a lacY counterselective
marker to be utilised. Strain KS1-OR2HF+ was created by
transformation of the fertility-conferring F' plasmid into
KS1-OR2HF-. KS1-OR2HF- was created by site-specific knock-out
and insertion of the sacB reporter gene construct into the
arabinose operon of strain KS1-ORTet by transformation of the
plasmid pKO3-araOrsacB and subsequent selection for stable
insertions using the method of Link et al. (1997).
pKO3-araOrsacB was prepared by blunt-ended ligation of a 1.4 Kb
OrsacB fragment into Stu I digested pKO3-ARA to produce an
insert of the OrsacB fragment flanked by 2.5 Kb and 1.0 Kb
of the 3' and 5' ends of the E.coli arabinose operon
respectively. pKO3-ARA carries the complete E.coli arabinose
operon, which had been amplified by PCR from E.coli genomic DNA
using tailed primers, digested with Sal I and cloned into the
Sal I site of pKO3 by standard procedures. The OrsacB fragment
was created by ligating together PCR fragments of the
placOR2-62 promoter and the sacB gene. The placOR2-62 promoter
and sacB PCR fragments were amplified from the plasmids
KJ306-31 and pKO3 using standard procedures and anchor primers
which gave rise to complementary overhangs between the two
consecutive fragments, which were subsequently annealed to
generate the chimeric sequence (see, for example, Current
Protocols in Molecular Biology, Eds. Ausubel et al., John
Wiley & Sons: 1992). The lac promoter derivative placOR2-62
carried by the plasmid KJ306-31 was prepared by cleaving the
plasmid KJ306 with Hinc II and inserting a 31 bp linker
sequence (Dove et al. 1997).
The strain KS1-ORTet was created by site-specific knock-out
and insertion of a tetracycline reporter gene under the
control of the placOR2-62 promoter into the lactose operon of
strain KS1F-, also by genomic knock-out utilising the pKO3
system. The tetracycline gene was obtained by PCR of the
plasmid pACYC184. Modifications to the above knock-out
insertion method were made to produce an appropriate pKO3
construct to enable the knock-out insertion of the chimeric
tetracycline reporter gene into the lactose operon, as will be
possible for a person skilled in the art. The E.coli strain
KS1F- was constructed from KS1 (Dove et al.) by removal of the
F' plasmid using standard plasmid curing procedures.

Two vectors, pBAD18-αRNAP and pBAD30-cI, were constructed to
provide further genetic features to enable the method of the
invention (Figure 22). The vectors are based on the pBAD
series of vectors, which provide tight inducible control of
expression of cloned genes using the promoter from the
arabinose operon (Guzman et al., 1995 J. Bact. 177: 4121-4130),
and can be maintained in the same E.coli cell by virtue of
compatible origins of replication. The plasmid pBAD18-αRNAP
expresses, under the control of the arabinose promoter, fusion
proteins between the amino terminal domain (NTD) of the
α-subunit of RNA polymerase and DNA fragments cloned into the
multiple cloning site. The presence of this plasmid in
kanamycin sensitive cells can be selected for by plating on
growth medium supplemented with kanamycin, and its absence can
be selected for via the counterselective rpsL allele by plating
on media supplemented with streptomycin (Murphy et al. 1995).
The plasmid pBAD30-cI expresses, under the control of the
arabinose promoter, fusion proteins between the λcI protein and
DNA fragments cloned into the multiple cloning site. The
presence of this plasmid in ampicillin sensitive cells can be
selected for by plating on growth medium supplemented with
ampicillin, and its absence can be selected for via the
counterselective lacY gene by plating on media supplemented
with 2-nitrophenyl-β-D-thiogalactoside (tONPG)
(Murphy et al. 1995). Additionally, the 288 bp oriT sequence
enables unidirectional genetic exchange of the pBAD30-cI
plasmid and its derivatives from E.coli cells containing the
F' fertility factor to F- strains lacking the fertility
factor.

The plasmid pBAD18-αRNAP was constructed by cloning a 0.7 Kb
DNA fragment encoding the amino terminal domain (NTD)
(residues 1-248) of the α-subunit of RNA polymerase (α-NTD)
into Eco RI digested pBAD18-CS. The 0.7 Kb α-NTD fragment was
isolated by PCR from the plasmid pHTf1α (Tang et al., 1994
Genes Dev 8: 3058-3067). The plasmid pBAD18-CS was obtained by
site-specific insertion assisted by PCR cloning of the 400 bp
coding region and translational start site of the rpsL allele
into pBAD18-Kan (Guzman et al. 1995) before the transcriptional
termination signal of the kanamycin gene to enable
polycistronic transcription of the counterselective and
selective markers. The rpsL allele was obtained by PCR
amplification of the plasmid pNO1523 (Murphy et al. 1995).

The plasmid pBAD30-cI was constructed by cloning a 730 bp DNA
fragment encoding the λcI protein into Eco RI digested
pBAD30-TCS. The 730 bp fragment encoding the λcI protein was
isolated by PCR from the plasmid pACλcI (Dove et al. 1997). The
plasmid pBAD30-TCS was obtained by site-specific insertion
assisted by PCR cloning of the 1.3 Kb coding region and
translational start site of the lacY gene into pBAD30-T before
the transcriptional termination signal of the ampicillin gene to
enable polycistronic transcription of the counterselective and
selective markers. The lacY gene was obtained by PCR
amplification of the plasmid pCM10 (Murphy et al. 1995). The
plasmid pBAD30-T was obtained by site-specific insertion of a
288 bp oriT sequence obtained by PCR from the F' plasmid
between the M13 intergenic region and cat' locus of pBAD30
(Guzman et al. 1995).

10.2 Detection and identification of interacting proteins 
using a large-scale and automated prokaryotic two-hybrid 
system 

Generation of libraries of E.coli cells expressing fusion
proteins

The pSport1 plasmid extraction containing the amplified cDNA
library of Strongylocentrotus purpuratus described in section
6.1 was used. Approximately 1 µg of the library inserts was
then isolated from the plasmid DNA by Hind III/Sal I digestion
and size-selective (1-1.5 Kb) agarose gel purification using
standard procedures.

The two plasmids pBAD18-αRNAP and pBAD30-cI were prepared by
digestion with Hind III/Sal I. The insert mixture that was
isolated as above was split into two equal fractions and 300
ng was ligated with 50 ng of each of the two prepared
plasmids. Following ligation, the pBAD18-αRNAP reaction was
then transformed into competent KS1-OR2HF- E.coli cells, and
the pBAD30-cI reaction was transformed into competent
KS1-OR2HF+ E.coli cells.

Genetic preselection against false positive clones and the 
automated creation of a regular grid-pattern of E.coli cells 
expressing a fusion protein 

The two transformation mixes were plated onto large 24 x 24 cm
agar trays (Genetix, UK) containing selective media. The F-
cells containing the pBAD18-αRNAP fusion library were plated
onto LB selective medium supplemented with kanamycin (50
µg/ml), arabinose (0.2% w/v) and sucrose (5% w/v). The F+
cells containing the pBAD30-cI fusion library were plated onto
LB selective medium supplemented with ampicillin (100 µg/ml),
arabinose (0.2%) and sucrose (5%). The agar trays were poured
using an agar-autoclave and pump (Integra, Switzerland) to
minimise tray-to-tray variation in agar colour and depth.
After plating, the colonies were grown by incubating the trays
at 37°C for 18 to 24 hours. The E.coli cells expressed fusion
proteins under the control of the arabinose promoter, and
those cells expressing single fusion proteins able to
auto-activate the sacB reporter gene were unable to grow, since
expression of the sacB gene confers sensitivity to sucrose
supplemented in the growth media at high concentrations.

Automated picking of E.coli clones for DNA analysis using
vision-controlled robotic systems such as described in Lehrach
et al. (1997) is well known in the art. Such systems should
also be appropriate for the analysis of E.coli cells that
express interacting or potentially interacting fusion
proteins. Therefore, a laboratory picking robot was used to
automatically pick individual E.coli colonies from the
selective agar-trays into individual wells of a sterile
384-well microtiter plate (Genetix, UK) containing sterile
liquid medium. The cells expressing the pBAD18-αRNAP fusion
library were inoculated into liquid LB selective medium
supplemented with kanamycin (50 µg/ml) and 10% (v/v) glycerol
(LB+Kan/10%Gly), while the cells expressing the pBAD30-cI
fusion library were inoculated into LB selective medium
supplemented with ampicillin (100 µg/ml) and 10% (v/v)
glycerol (LB+Amp/10%Gly). The resulting microtiter plates were
incubated at 37 °C for 18 to 24 hours, and after growth of
E.coli strains within the microtiter plates, each plate was
labelled with a unique number and barcode. The plates were
also replicated to create two additional copies using a
sterile 384-pin plastic replicator (Genetix, UK) to transfer a
small amount of cell material from each well into pre-labelled
384-well microtiter plates pre-filled with the liquid
selective medium containing 10% glycerol as was appropriate
for the E.coli strain. The replicated plates were incubated at
37 °C for 18 to 24 hours, subsequently labelled, frozen and
stored at -70°C together with the original picked microtiter
plates of the libraries of E.coli cells expressing fusion
proteins.

In this manner, we generated regular grid patterns of E.coli
cells expressing fusion proteins using a robotic and automated
picking system. 384-well microtiter plates have a well every
4.5 mm in a 16 by 24 well arrangement. Therefore, for each
384-well microtiter plate we automatically created a regular
grid pattern at a density greater than 4 clones per square
centimetre. It will be clear that higher density regular
grid-patterns of such an interaction library can be easily
generated by a person skilled in the art from these microtiter
plates of E.coli cells by following the methods disclosed in
sections 3.2, 3.3 and 3.4 of this invention. For example,
densities of greater than 19 clones per square centimetre can
be obtained by robotic pipetting of clones into wells of a
1536-well microtiter plate.
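
The densities quoted above follow directly from the well
spacing, as the short Python calculation below illustrates. The
4.5 mm pitch is taken from the text. The 2.25 mm pitch used for
the 1536-well plate is an assumption based on the usual geometry
of such plates and is not stated in the source, and the function
name is introduced only for this example.

```python
def clones_per_cm2(well_pitch_mm: float) -> float:
    """Density of a square grid holding one clone per well."""
    pitch_cm = well_pitch_mm / 10.0
    return 1.0 / (pitch_cm * pitch_cm)


PITCH_384_MM = 4.5    # stated in the text
PITCH_1536_MM = 2.25  # assumed standard 1536-well pitch, not from the text

print(f"384-well:  {clones_per_cm2(PITCH_384_MM):.2f} clones per cm^2")
print(f"1536-well: {clones_per_cm2(PITCH_1536_MM):.2f} clones per cm^2")
```

Halving the pitch quadruples the density, which is why the step
from 384 to 1536 wells moves the figure from just over 4 to just
over 19 clones per square centimetre.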

Visual differentiation against false positive clones and the
automated creation of a regular grid-pattern of E.coli cells
expressing a fusion protein

To demonstrate that visual differentiation against cells that
express single fusion proteins that auto-activate the readout
system could be applied to a prokaryotic two-hybrid system,
the libraries of fusion proteins described in section 10.2.1
were utilised. The two transformation mixes were plated onto
large 24 x 24 cm agar trays (Genetix, UK) containing selective
media. The F- cells containing the pBAD18-αRNAP fusion library
were plated onto LB selective medium supplemented with
kanamycin (50 µg/ml), arabinose (0.2%) and X-Gal (2 µg/ml).
The F+ cells containing the pBAD30-cI fusion library were
plated onto LB selective medium supplemented with ampicillin
(100 µg/ml), arabinose (0.2%) and X-Gal (2 µg/ml). The agar
trays were poured using an agar-autoclave and pump (Integra,
Switzerland) to minimise tray-to-tray variation in agar colour
and depth. After plating, the colonies were grown by
incubating the trays at 37°C for 18 to 24 hours to allow
any blue colour of colonies to develop. The E.coli cells
expressed fusion proteins under the control of the arabinose
promoter, and those cells expressing fusion proteins able to
auto-activate the lacZ reporter gene turned blue by enzymatic
reaction of the X-Gal substrate as is well known in the art.

Using an automated picking system, white E.coli cells
expressing single fusion proteins unable to activate the
readout system were automatically visually differentiated from
false positive E.coli cells that had turned blue and only
white E.coli cells were arrayed in a regular grid pattern. A
standard laboratory picking robot (Lehrach et al., 1997) was
used except that the improvements relating to reliable sorting
of white from blue yeast colonies as described in section 3.1
were also used to reliably discriminate between white and blue
E.coli colonies. White E.coli colonies from the two sets of
agar trays prepared above were automatically picked and
inoculated into the appropriate selective media in 384-well
microtiter plates as described in section 10.2. It will be
recognised by a person skilled in the art that higher density
regular grid patterns of these clones may easily be formed.

Automated interaction conjugation to combine genetic elements 
in E.coli cells 

It will be clear to a person skilled in the art that automated 
interaction mating on a solid support as described for yeast 
cells in section 9.1 is equally appropriate for E.coli cells 
of different conjugation types that have been selected by the 
methods of genetic preselection or visual differentiation as 
disclosed in this invention. In such case, appropriate 
modifications to the selective media would be required. 
However, a person skilled in the art would be able to 
recognise and effect said modifications to the selective media 
by following the disclosures herein. 

To demonstrate an automated approach to interaction
conjugation based on liquid culture, two libraries of clones
that express fusion proteins were prepared as described in
section 10.1. The F- cells containing the pBAD18-αRNAP fusion
library were plated onto LB selective medium supplemented with
kanamycin (50 µg/ml), arabinose (0.2%) and sucrose (5%). The
F+ cells containing the pBAD30-cI fusion library were plated
onto LB selective medium supplemented with ampicillin (100
µg/ml), arabinose (0.2%) and sucrose (5%).

To conduct the liquid interaction conjugation, the resulting
F- and F+ colonies were separately collected off the
agar-trays by washing with 20 ml of liquid LB medium. These two
mixtures of E.coli clones were carefully resuspended, pelleted
and washed with LB. The two populations of cells were combined
in 500 ml of LB liquid media and incubated at 37 °C with gentle
shaking for 6 hours to allow interaction conjugation to
proceed. The resulting mixture of E.coli cells was pelleted by
gentle centrifugation at 3000 rpm for 5 min, washed twice with
50 ml of LB liquid media and finally, 10 ml of the resulting
cell suspension was plated onto each of five 24 x 24 cm
agar-trays containing 300 ml of the solid LB selective medium
supplemented with ampicillin (100 µg/ml), kanamycin (50
µg/ml), arabinose (0.2%) and tetracycline (35 µg/ml)
(LA+Amp+Kan+Tet+ara). The agar trays were poured using an
agar-autoclave and pump (Integra, Switzerland) to minimise
tray-to-tray variation in agar colour and depth. After
plating, the colonies were grown by incubating the trays at
37°C for 18 to 24 hours.

After incubation, the resulting E.coli cells that expressed 
interacting fusion proteins grew on the surface of the 
selective agar and were automatically picked using a 
laboratory picking system as described in section 10.2, except 
that picked clones were inoculated into microtiter plates 
containing liquid LB medium supplemented with ampicillin 
(100 µg/ml), kanamycin (50 µg/ml) and 10% (v/v) glycerol 
(LB+Amp+Kan/10%Gly). The interaction library comprising the 
E.coli cells contained in the microtiter plates was grown by 
incubation at 37 °C for 18 to 24 hours. Two further copies of 
the interaction library were made into new microtiter plates 
containing LB+Amp+Kan/10%Gly growth medium. All plates were 
individually labelled with a unique barcode and stored at 
-70 °C until required for further analysis as described above. 
It will be recognised by a person skilled in the art that 
higher-density regular grid patterns of these clones may 
easily be formed. 

Generation of a regular grid pattern of clones from an 
interaction library on planar carriers using automation 

A high-throughput spotting robot such as that described by 
Lehrach et al. (1997) was used to construct porous planar 
carriers with a high-density regular grid pattern of E.coli 
clones from the defined interaction library contained within 
the 384-well microtiter plates described above. The robot 
recorded the position of individual clones in the high-density 
grid pattern by the use of a pre-defined duplicate spotting 
pattern and the barcode of the microtiter plate. Individually 
numbered membrane sheets sized 222 x 222 mm (Hybond N+, 
Amersham UK) were pre-soaked in LB medium, laid on a sheet of 
3MM filter paper (Whatman, UK) also pre-soaked in LB medium 
and placed in the bed of the robot. The interaction library 
was automatically arrayed as replica copies onto the membranes 
using a 384-pin spotting tool affixed to the robot. Microtiter 
plates from the first copy of the interaction library were 
replica spotted in a "5x5 duplicate" pattern around a central 
ink guide-spot onto 10 nylon membranes, corresponding to 
positions for over 27,000 clones spotted at a density of over 
100 spots per cm². The robot created a data-file in which the 
spotting pattern produced and the barcode that had been 
automatically read from each microtiter plate were recorded. 
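
By way of illustration only, the stated clone count and spot density can be checked with a short calculation. In the following Python sketch the 384 pins, the 5x5 block with a central ink guide-spot, the duplicate spotting and the 222 x 222 mm membrane are taken from the text. The number of fields spotted per membrane (FIELDS = 6) is an assumption made here so that totals can be computed; it is not stated in the description.

```python
# Worked arithmetic for the "5x5 duplicate" spotting pattern.
PINS = 384          # pins on the spotting tool (from the text)
BLOCK = 5 * 5       # spot positions per pin block (from the text)
GUIDE_SPOTS = 1     # central ink guide-spot per block (from the text)
REPLICATES = 2      # every clone is spotted in duplicate (from the text)
SIDE_CM = 22.2      # membrane edge length, 222 mm (from the text)
FIELDS = 6          # ASSUMPTION: fields spotted per membrane


def membrane_layout(fields=FIELDS):
    """Return (clones, clone spots, clone spots per cm^2) for one membrane."""
    clone_spots_per_block = BLOCK - GUIDE_SPOTS           # 24 positions
    clones_per_block = clone_spots_per_block // REPLICATES  # 12 clones
    clones = fields * PINS * clones_per_block
    spots = clones * REPLICATES
    density = spots / (SIDE_CM ** 2)
    return clones, spots, density


clones, spots, density = membrane_layout()
print(f"{clones} clones, {spots} spots, {density:.1f} spots per cm^2")
```

Under this assumption the sketch yields a clone count above 27,000 and a density above 100 spots per cm², in line with the figures given above.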

Each membrane was carefully laid onto approximately 300 ml of 
solid agar medium in 24 x 24 cm agar-trays. Six membranes were 
transferred to LB+Amp+Kan+Tet agar containing 0.2% arabinose, 
and two each of the remaining membranes were transferred to 
either LB agar supplemented with kanamycin (50 µg/ml), 
arabinose (0.2%) and tONPG (1 mM) (LB+Kan+ara+tONPG) or LB 
agar supplemented with ampicillin (100 µg/ml), arabinose 
(0.2%) and streptomycin (at an appropriate concentration for 
counterselection) (LB+Amp+ara+Sm). The E.coli colonies were 
allowed to grow on the surface of the membranes by incubation 
at 37 °C for 18 to 24 hours. 

Detection of the readout system in a regular grid pattern 

Two membranes from each of the selective media were processed 
to detect β-galactosidase activity using the method of Breeden 
& Nasmyth (1985), and a digital image was captured and stored 
on computer as described in section 4.1. Using the image 
analysis and computer systems described in section 4.1, 
positive E.coli clones were identified by consideration of the 
activation state of the β-galactosidase readout system when 
clones had been grown on the various selective media. Positive 
clones were identified as those that turned blue after growth 
on the selective medium LB+Amp+Kan+Tet+ara but not when grown 
on either of the counterselective media LB+Kan+ara+tONPG or 
LB+Amp+ara+Sm. 
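
The decision rule stated above may be sketched as follows. This Python example is a minimal illustration and not the image-analysis system of section 4.1: the media names are taken from the text, whereas the CloneReadout record, the is_positive function and the clone identifiers are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass

# Media names follow the text; everything else here is hypothetical.
SELECTIVE = "LB+Amp+Kan+Tet+ara"
COUNTERSELECTIVE = ("LB+Kan+ara+tONPG", "LB+Amp+ara+Sm")


@dataclass(frozen=True)
class CloneReadout:
    clone_id: str
    blue_on: frozenset  # media on which the clone turned blue


def is_positive(readout: CloneReadout) -> bool:
    """Blue on the selective medium and on neither counterselective medium."""
    if SELECTIVE not in readout.blue_on:
        return False
    return not any(m in readout.blue_on for m in COUNTERSELECTIVE)


# Invented example readouts for three hypothetical clones.
readouts = [
    CloneReadout("clone_1", frozenset({SELECTIVE})),
    CloneReadout("clone_2", frozenset({SELECTIVE, "LB+Kan+ara+tONPG"})),
    CloneReadout("clone_3", frozenset()),
]
positives = [r.clone_id for r in readouts if is_positive(r)]
print(positives)
```

A clone that is blue on a counterselective medium activates the readout system while carrying only one of the two plasmids, and is therefore scored as a false positive.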

Identification of individual members of the interaction 

A positive E.coli clone (identified as 15F09) that expressed 
interacting fusion proteins, as determined by the computer 
systems described above, was recovered from a stored frozen 
copy of the interaction library. Both members comprising the 
interaction were recovered by specific PCR amplification of 
the inserts carried by the pBAD18-αRNAP and pBAD30-cI plasmids 
directly from the E.coli culture using plasmid-specific 
primers. Both members of the interaction were sequenced by 
standard procedures, and the information was entered into a 
database as described in Example 7. 

As described in section 4.1, high-density arrays of DNA 
representing interaction libraries or members comprising 
interaction libraries can be made by transfer to solid 
supports by a variety of means. To demonstrate the 
applicability of DNA hybridisation to identify E.coli clones 
carrying plasmids that encode interacting fusion proteins, 
one membrane that had been taken from the LB+Amp+Kan+Tet+ara 
growth medium was processed to affix the DNA carried by the 
E.coli cells comprising the interaction library according to 
the method of Hoheisel et al. (1991). The insert carried by 
the pBAD30-cI plasmid of clone 15F09 was radioactively 
labelled by the method of Feinberg & Vogelstein (1983) and 
used as a hybridisation probe to the DNA array, and positive 
signals were identified as described in section 4.1. A clone 
(22C11) was identified as hybridising to the probe and was 
shown to be a positive clone by query of the database 
described in section 4.1. In this manner, further steps in a 
protein-protein interaction pathway can be identified by 
hybridisation, consideration of reporter gene activation of 
hybridisation-positive clones and recovery of plasmids 
encoding members comprising these interactions. Recovery of 
the plasmids allows further investigation, such as DNA 
sequencing to identify the members or repeated hybridisation 
to identify further steps in the protein-protein interaction 
pathway, and hence the development of protein-protein 
interaction networks as described in section 6.6. 
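
The repeated-hybridisation strategy amounts to a stepwise walk through the interaction library. The following Python sketch is an assumption-laden illustration: it models the library as a mapping from clone identifier to the pair of inserts the clone carries, treats "hybridises to the probe" as "carries the probe insert", and presumes that only clones already scored as positive are included. The function name, the clone identifiers and the insert names are hypothetical.

```python
from collections import deque


def walk_interactions(library, start_insert, max_steps=3):
    """Breadth-first walk from one insert through a library of positives.

    library maps clone id -> (insert_a, insert_b). Each round uses an
    insert as probe, finds the clones carrying it, and takes the partner
    insert of each such clone as a probe for the next round.
    Returns the set of interactions found, each as a frozenset of inserts.
    """
    edges = set()
    seen = {start_insert}
    queue = deque([(start_insert, 0)])
    while queue:
        probe, depth = queue.popleft()
        if depth >= max_steps:
            continue
        for inserts in library.values():
            if probe not in inserts:
                continue  # clone does not hybridise to this probe
            a, b = inserts
            partner = b if a == probe else a
            edges.add(frozenset((probe, partner)))
            if partner not in seen:
                seen.add(partner)
                queue.append((partner, depth + 1))
    return edges


# Hypothetical library of positive clones and their insert pairs.
library = {
    "clone_1": ("X", "Y"),
    "clone_2": ("Y", "Z"),
    "clone_3": ("Q", "R"),
}
network = walk_interactions(library, "X")
print(sorted(sorted(edge) for edge in network))
```

Limiting max_steps corresponds to limiting the number of hybridisation rounds performed.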

Example 11: Application of the improved two-hybrid system 
to a mammalian two-hybrid system 

11.1 Strains, readout systems and vectors 

The human embryonic kidney fibroblast-derived cell line HEK 
293 (or simply 293 cells) is especially suitable for mammalian 
2H studies because it takes up DNA readily during 
transfection (Graham, F.L. and Van der Eb, A.J. (1973), Virol. 
54: 536-539; Graham, F.L., Smiley, J., Russell, W.C. and Nairn, 
R. (1977), J. Gen. Virol. 36: 59-72). The cell line is 
available from ATCC. 

Plasmids carrying the mammalian readout systems named 
pG5E1bEGFPneo, pG5E1bEGFPhyg or pG5E1bEGFPpur are used. These 
plasmids contain the TATA element of the adenoviral E1b gene 
and five tandem copies of the GAL4 responsive element UASG (5' 
CGGAGTACTGTCCTGCG 3') (Sadowski, I., Ma, J., Triezenberg, S. 
and Ptashne, M. (1988), Nature 335: 559-560) positioned 
immediately upstream of the coding sequence for the enhanced 
green fluorescent protein (EGFP; Yang, T.T., Cheng, L. and 
Kain, S.R. (1996), Nucl. Acids Res. 24 (22): 4592-4593). These 
reporter plasmids are generated by replacing the coding 
sequence for CAT in G5E1bCAT (Dang, C.V., Barrett, J., 
Villa-Garcia, M., Resar, L.M.S., Kato, G.J. and Fearon, E.R. 
(1991), Mol. Cell. Biol. 11: 954-962) with the EGFP coding 
sequence and introducing either a neomycin, hygromycin or 
puromycin resistance marker gene (neoʳ, hygʳ or purʳ) using 
standard subcloning procedures. 

The plasmids pMneo1,2,3 or pMhyg1,2,3, which are derived from 
pM1,2,3 (Sadowski, I., Bell, B., Broad, P. and Hollis, M. 
(1992), Gene 118: 137-141) by insertion of either a neoʳ or 
hygʳ marker gene using standard subcloning procedures, are a 
series (1,2,3 correspond to the three possible reading frames) 
of improved Gal4p-fusion vectors derived from the pSG424 
plasmid, which was designed for mammalian expression of fusion 
proteins that contain the DNA-binding domain of the yeast Gal4 
protein (Sadowski, I. and Ptashne, M. (1989), Nucl. Acids Res. 
17: 7539). This vector contains a polylinker preceded by 
coding sequences for Gal4p amino acids 1-147. Thus, a hybrid 
reading frame that encodes a Gal4p-fusion protein can be 
generated by inserting cDNA sequences into the polylinker 
region of pSG424/pNTs. Transcripts of the hybrid reading frame 
are initiated from the SV40 early promoter and their 
processing is facilitated by the SV40 polyadenylation signal. 
Alternatively, the hybrid reading frames are subcloned into 
pLXSN or any other similar retroviral vector to allow 
packaging cell line-aided infection of target cells. 

The plasmids pVP-Nconeo and pVP-Ncohyg are derived from the 
pVP-Nco vector (Tsan, J., Wang, Z., Jin, Y., Hwang, L., Bash, 
R.O., Baer, R. The Yeast Two-Hybrid System, edn 1. Edited by 
Bartel, P.L., Fields, S. New York: Oxford University Press 
(1997): 217-232) by insertion of either a neoʳ or hygʳ marker 
gene using standard subcloning procedures. pVP-Nco in turn is 
an improved version of the pNLVP16 plasmid, which was 
constructed for the expression of herpes simplex virus protein 
VP16-fusion proteins in mammalian cells (Dang, C.V., Barrett, 
J., Villa-Garcia, M., Resar, L.M.S., Kato, G.J. and Fearon, 
E.R. (1991), Mol. Cell. Biol. 11: 954-962). A polylinker 
sequence is preceded by an artificial reading frame including 
the eleven amino-terminal residues of Gal4p (MKLLSSIEQAC), a 
nuclear localization signal from the SV40 large T antigen 
(PKKKRKVD) and the acidic transactivation domain (amino acids 
411-456) of the VP16 protein. Alternatively, the hybrid 
reading frames comprising Gal4 (1-147) and individual 
sequences of a cDNA library are subcloned into pLXSN or any 
other similar retroviral vector to allow packaging cell 
line-aided infection of target cells. 

11.2 Detection and Identification of Interacting Proteins 

A number of monoclonal 293 cell lines stably containing the 
pG5E1bEGFPneo, pG5E1bEGFPhyg or pG5E1bEGFPpur readout system 
are generated by calcium phosphate transfection (Chen, C. and 
Okayama, H. (1987), Mol. Cell. Biol. 7: 2745-2752), 
lipofectamine transfection or any other common transfection 
method, followed by selection in medium containing G418, 
hygromycin B (HygB) or puromycin, respectively. The clones are 
subsequently tested to determine which is most appropriate for 
the method of the invention, because the number of readout 
system copies and the site(s) of integration into the host 
chromosomes may influence expression levels and inducibility 
of the reporter gene. 

The selected 293-G5E1bEGFPneo, 293-G5E1bEGFPhyg or 
293-G5E1bEGFPpur reporter cell line is used as a "modified 
host cell strain" to perform the method of the invention 
(detection and identification of interacting proteins). 

Two pools representing all three reading frames of the two 
vector series pMneo or pMhyg and pVP-Nconeo or pVP-Ncohyg were 
prepared by NotI/SalI digestion and pooling of 1 µg each of 
vectors pMneo / pMhyg 1,2,3 and pVP-Nconeo / pVP-Ncohyg 1,2,3 
respectively. 300 ng of a cDNA insert mixture that was 
isolated as described in section 6.1 was split into two equal 
fractions, and each fraction was ligated with 50 ng of one of 
the prepared vector-series pools. Following ligation, each 
reaction was separately transformed into electro-competent 
E.coli cells, and recombinant clones for each library were 
selected on five 24 x 24 cm plates containing ampicillin. 
Approximately 500 µg of the pVP-Nconeo / pVP-Ncohyg and 500 µg 
of the pMneo / pMhyg libraries were extracted from E.coli 
transformants by washing off the plated cells and a subsequent 
QiaPrep plasmid extraction of the wash mixture as described 
above. 16 µg of each vector was used to transfect a 10 cm 
plate of 293 cells. 

11.3 Pre-selection against False Positives by visual 
differentiation 

The pMneo1,2,3 or pMhyg1,2,3 plasmids containing the cDNA 
library fused to the Gal4 DNA-binding domain were transfected 
into the selected 293 reporter cell line. For infection with 
retroviruses, designated packaging cell lines are transfected 
with the respective retroviral vectors, and virus-containing 
supernatant from such cultures is then used to infect the 
reporter cell line (according to standard protocols; e.g. 
Redemann, N., Holzmann, v. Ruden, T., Wagner, E.F., 
Schlessinger, J. and Ullrich, A. (1992), Mol. Cell. Biol. 12: 
491-498). Transfection and infection protocols can be 
optimized to introduce on average only one plasmid per cell 
by adjusting the plasmid concentration for transfection or 
the virus titer during infection. The antibiotics G418 or 
HygB are employed to select for successfully 
transfected/infected reporter cells. 
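
The effect of adjusting the plasmid concentration or virus titer can be illustrated with a simple statistical model. The following Python sketch assumes, purely for illustration, that the number of plasmids or proviruses taken up per cell follows a Poisson distribution with a given mean; this idealisation is not part of the description, and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a cell receives exactly k copies (Poisson model)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def single_copy_fraction(mean: float) -> float:
    """Among cells that received at least one copy, fraction with exactly one."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    p_one = poisson_pmf(1, mean)
    p_any = 1.0 - poisson_pmf(0, mean)
    return p_one / p_any


for mean in (0.1, 0.5, 1.0, 2.0):
    print(f"mean copies per cell {mean}: "
          f"{single_copy_fraction(mean):.2f} of selected cells carry one copy")
```

Under this model, lowering the mean number of copies per cell raises the proportion of antibiotic-selected cells that carry a single library member, at the cost of fewer cells being transfected or infected at all.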

At this stage it is necessary to eliminate those cells that 
display readout system activation as a consequence of 
expressing only a DNA-binding domain-fusion protein (in which 
case the fusion protein would be referred to as an 
"auto-activator"), instead of requiring an appropriate 
(interacting) transactivation domain-fusion protein to be 
coexpressed. Thus, the resultant polyclonal pool of stably 
transfected/infected reporter cells is subjected to a 
preselection screen using the readout system to visually 
differentiate cells that express auto-activating fusion 
proteins. In the EGFP-based readout system, cells expressing 
auto-activators can be identified by screening for expression 
of EGFP and consequently for the ability of the respective 
cells to emit green fluorescent light (507 nm) upon 
stimulation with the appropriate excitatory wavelength 
(488 nm) (Yang, T.T., Cheng, L. and Kain, S.R. (1996), Nucl. 
Acids Res. 24 (22): 4592-4593). Readout system activation is 
monitored either by eye using a fluorescence microscope or by 
an automated detection device. The cells that activated the 
GFP reporter system were visually differentiated and sorted 
from other cells not activating the reporter system using a 
fluorescence-activated cell sorting (FACS) device. 
Alternatively, false positive cells expressing auto-activators 
are eliminated either manually, by removal/killing of cells by 
means of a suction pump or a micromanipulator, or by a 
detector-linked automated system employing micromanipulators 
or a laser ablation device. 

After elimination of cells that express auto-activating fusion 
proteins, the remaining polyclonal pool of 293 reporter cells 
expressing DNA-binding fusion proteins is subjected to a 
second transfection/infection step as described above using 
pVP-Nconeo or pVP-Ncohyg plasmids or respective retroviral 
derivatives containing the cDNA library fused to the VP16 
transactivator sequence. Selection for successfully 
transfected/infected cells employing the antibiotics G418 or 
HygB is optional here. If selection is desired, it must be 
ensured that the resistance marker that forms part of the 
readout system is different from the marker genes on 
previously transfected/infected vectors. Addition of the 
antibiotic selecting for the second transfection/infection 
vector may be necessary if the subsequent screening/final 
selection procedures take several days to complete, in order 
to prevent loss/diluting out of the plasmids in the absence of 
selective pressure. A complete selection also eliminates cells 
that have not been successfully cotransfected (i.e. have not 
received a pVP-Nco plasmid), although such cells would not be 
a major problem (as long as transfection/infection efficiency 
is high) because they would not be identified by the 
interaction screening anyway. It is also noteworthy that the 
longer the cells are kept in culture until cell lysis (and 
molecular analyses of the interacting proteins and their 
corresponding cDNA sequences), the more likely it is that 
cDNAs encoding more or less toxic fusion proteins will be 
lost. 

11.4 Automated Identification of Cells Expressing Interacting 
Proteins 

The resulting polyclonal pool of doubly transfected reporter 
cells is then subjected to visual screening for interacting 
proteins as described for the visual preselection. Green 
fluorescent ("positive") cells, indicative of the expression 
of two interacting proteins, were automatically sorted using a 
FACS system to arrange cells in a regular grid pattern in 
wells of a microtiter plate. Subsequently, single-cell PCR and 
DNA sequencing were conducted to identify members comprising 
the interactions. Alternatively, the positive cells can be 
seeded onto a culture dish in a regular array/grid pattern. 
Cells might also be placed one by one into small wells of a 
multiwell dish and provided with an appropriate growth 
factor-supplemented medium or conditioned medium to allow the 
cells to survive and grow in isolation from other cells. 

11.5 Double Preselection and Cell Fusion 

The cotransfection protocol described above only includes a 
single preselection (instead of a double preselection). It 
does not include the possibility of a preselection against 
false positive clones arising from pVP-Nco (transactivation 
domain-cDNA fusion library) plasmids. Although the number of 
false positives from pVP-Nco plasmids is usually much lower 
than from pM1,2,3 (DNA binding domain-cDNA fusion library) 
plasmids, it may under some circumstances be necessary to 
apply a double preselection strategy. 

To that end, two different polyclonal pools of stable cell 
lines expressing members of either the pM- or the pVP-Nco-cDNA 
fusion library are generated by transfection/infection of the 
293 reporter cell line and selected by means of the respective 
(different) antibiotics (G418 and HygB) as described above. 
Both pools of cell lines are then subjected separately to 
preselection and elimination of false positive clones as 
detailed above. 

In order to combine both fusion vectors and their 
corresponding expressed fusion proteins in one cell, 
individual cells of both pools of cell lines are fused 
together using state-of-the-art cell fusion protocols 
involving PEG-facilitated electrofusion as described in Li, 
L.-H. and Hui, S.W. (1994), Biophys. J. 67: 2361-2366; Hui, 
S.W., Stoicheva, N. and Zhao, Y.-L. (1996), Biophys. J. 71: 
1123-1130, and Stoicheva, N. and Hui, S.W. (1994), Membrane 
Biol. 140: 177-182. Fusion between one cell from each pool is 
desired. For that purpose, one cell of each pool is placed 
into each well of a multiwell dish as detailed above. After 
cell fusion, the combined cells are subjected to visual 
selection. Cells are left on the same dish for visual or 
automated screening, or collected and sorted by FACS. 

11.6 Double Preselection and Cell Fusion Using an Inducible 
Expression System 

A disadvantage of the above-described double preselection 
method is that proteins with toxic or anti-proliferative 
effects and their corresponding cDNAs will be lost during the 
lengthy selection process necessary to establish polyclonal 
pools of stable cell lines for both cDNA-fusion library 
sequences. In order to prevent elimination of cDNA sequences 
encoding toxic/anti-proliferative proteins, one can combine 
the double preselection strategy with the following inducible 
system. 






The host cell strain is a 293 cell line which expresses a 
tetracycline (Tet)-controlled transactivator (tTA), which is a 
fusion of amino acids 1-207 of the tetracycline repressor 
(TetR) and the C-terminal activation domain (130 amino acids) 
of herpes simplex virus protein VP16. The cell line is called 
293 Tet-Off as tTA is able to activate transcription from a 
Tet operator sequence (tetO)-controlled gene only in the 
absence of Tet. The reverse situation exists in the 293 Tet-On 
cell line, which stably expresses a reverse tTA ((r)tTA) that 
requires the presence of Tet to induce transcription from 
tetO-regulated genes. Both the 293 Tet-Off and 293 Tet-On cell 
lines are G418-resistant (neoʳ). These cell lines are 
available through Clontech Inc. The tTA plasmids used to 
generate the 293 Tet-Off and 293 Tet-On cell lines are 
described in Gossen, M. and Bujard, H. (1992), Proc. Natl. 
Acad. Sci. USA 89: 5547-5551 and in Gossen, M., Freundlieb, 
S., Bender, G., Muller, G., Hillen, W. and Bujard, H. (1995), 
Science 268: 1766-1769. 

293 Tet-On or -Off cell lines are then transfected with a 
readout system (described in 11.1) and the reporter cell 
lines 293 Tet-On- or -Off-pG5E1bEGFPhyg/pur are generated 
through selection in G418 or HygB. 

The sequences for the Gal4 DNA-binding domain and for the SV40 
nuclear localisation signal/VP16 transactivation domain 
(details and references as given in 11.1) are retrieved from 
pM and pVP-Nco plasmids and separately subcloned into the 
polylinker of pREV-TRE, a retroviral vector (Clontech Inc.), 
to generate pREV-TRE-Gal4 and pREV-TRE-VP16. pREV-TRE contains 
the retroviral extended packaging signal, Ψ+, which allows for 
production of infectious but replication-incompetent virus in 
conjunction with a packaging cell line such as PT67, followed 
by a hygʳ gene (selectable marker) and seven copies of tetO 
fused to the cytomegalovirus (CMV) minimal promoter 
immediately 5' of the polylinker. Ψ+ and polylinker sequences 
are flanked by 5' and 3' LTRs, respectively. pREV-TRE is 
available from Clontech Inc. cDNA libraries are subcloned 
into the polylinker of pREV-TRE. 

The above-described reporter cell lines are separately 
infected with either pREV-TRE-Gal4- or pREV-TRE-VP16-derived 
retroviral particles. A polyclonal pool of new stable cell 
lines is selected in both cases using the resistance selection 
marker gene hygʳ. Transient expression of fusion proteins from 
pREV-TRE plasmids has to be induced by withdrawal (Tet-Off) or 
addition (Tet-On) of Tet in order to allow for double 
preselection and elimination of false positives as described 
above. 
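
The switching logic of the two cell lines can be summarised in a few lines of code. The following Python sketch is a simplified truth table only: it ignores leaky expression and dose dependence, and the function names and the phase labels "establish" and "screen" are hypothetical labels introduced for this illustration.

```python
def fusion_expression_on(cell_line: str, tet_present: bool) -> bool:
    """Whether tetO-controlled fusion proteins are expressed (idealised)."""
    if cell_line == "Tet-Off":
        return not tet_present   # tTA is active only without Tet
    if cell_line == "Tet-On":
        return tet_present       # reverse tTA requires Tet
    raise ValueError(f"unknown cell line: {cell_line!r}")


def tet_for_phase(cell_line: str, phase: str) -> bool:
    """Whether Tet should be present in the medium during a phase.

    "establish": fusion proteins are kept silent while stable pools are
    selected, so that toxic members are not lost.
    "screen": fusion proteins are induced for preselection or detection.
    """
    want_expression = {"establish": False, "screen": True}[phase]
    for tet_present in (False, True):
        if fusion_expression_on(cell_line, tet_present) == want_expression:
            return tet_present


for line in ("Tet-Off", "Tet-On"):
    for phase in ("establish", "screen"):
        print(line, phase, "Tet present:", tet_for_phase(line, phase))
```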

11.7 Cell Fusion and Selection for Cells Expressing 
Interacting Proteins 

The remaining polyclonal pools of cell lines are then 
subjected to cell fusion as described above. The HygB 
concentration in the culture medium is increased to minimize a 
possible loss of either one component of the pairs of fusion 
protein cDNA sequences present in all fused cells. For the 
detection of positive clones, i.e. cells expressing a pair of 
interacting proteins (as detailed above), expression of fusion 
proteins has to be induced by addition or withdrawal of Tet. 

Table 1 

Oligonucleotide adapters for the construction of the novel 
yeast two-hybrid vectors pBTM118 a, b and c and pGAD428 a, b 
and c. 

Oligonucleotide        Sequence (5'-3') 

a    sense             TCGAGTCGACGCGGCCGCTAA 
A    antisense         GGCCTTAGCGGCCGCGTCGAC 
b    sense             TCGAGGTCGACGCGGCCGCAGTAA 
B    antisense         GGCCTTACTGCGGCCGCGTCGACC 
c    sense             TCGAGAGTCGACGCGGCCGCTTAA 
C    antisense         GGCCTTAAGCGGCCGCGTCGACTC 
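
The relationship between the sense and antisense strands listed in Table 1 can be verified computationally. The following Python sketch is illustrative: the sequences are copied from Table 1, while the assumption that each pair anneals to leave a four-nucleotide 5' overhang on each strand, and the function names, are introduced here. The position of the SalI recognition sequence GTCGAC in each sense strand is reported because it shifts by one nucleotide from adapter a to b to c, consistent with the three reading frames.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# (sense, antisense) pairs copied from Table 1.
ADAPTERS = {
    "a": ("TCGAGTCGACGCGGCCGCTAA", "GGCCTTAGCGGCCGCGTCGAC"),
    "b": ("TCGAGGTCGACGCGGCCGCAGTAA", "GGCCTTACTGCGGCCGCGTCGACC"),
    "c": ("TCGAGAGTCGACGCGGCCGCTTAA", "GGCCTTAAGCGGCCGCGTCGACTC"),
}
SALI_SITE = "GTCGAC"


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def annealed_duplex(sense: str, antisense: str, overhang: int = 4):
    """Return (sense 5' overhang, antisense 5' overhang, duplex core).

    ASSUMPTION: the strands pair everywhere except for the first
    `overhang` nucleotides at the 5' end of each strand.
    """
    core = sense[overhang:]
    if reverse_complement(core) != antisense[overhang:]:
        raise ValueError("strands are not complementary outside the overhangs")
    return sense[:overhang], antisense[:overhang], core


for name, (sense, antisense) in ADAPTERS.items():
    left, right, core = annealed_duplex(sense, antisense)
    print(name, left, right, len(core), sense.index(SALI_SITE))
```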



Table 2 

Two-hybrid vectors used for the expression of fusion proteins. 

Plasmid          Fusion          Insert  Counter-   Selection  Fusion protein 
                 protein         (kb)    selection  in yeast   reference 

pBTM117c         LexA            -       CAN1       TRP1       N/A 
pBTM117c-HD1.6   LexA-HD1.6      1.6     CAN1       TRP1       Wanker et al., 1997 
pBTM117c-HD3.6   LexA-HD3.6      3.6     CAN1       TRP1       Wanker et al., 1997 
pBTM117c-SIM1    LexA-SIM1       1.1     CAN1       TRP1       Probst et al., 1997 
pBTM117c-MJD     LexA-MJD        1.1     CAN1       TRP1       this work 
pBTM117c-HIP1    LexA-HIP1       1.2     CAN1       TRP1       this work 
pGAD427          GAL4ad          -       CYH2       LEU2       N/A 
pGAD427-ARNT     GAL4ad-ARNT     1.4     CYH2       LEU2       Probst et al., 1997 
pGAD427-HIP1     GAL4ad-HIP1     1.2     CYH2       LEU2       Wanker et al., 1997 
pGAD427-HIPCT    GAL4ad-HIPCT    0.5     CYH2       LEU2       Wanker et al., 1997 
pGAD427-14-3-3   GAL4ad-14-3-3   1.0     CYH2       LEU2       this work 
pGAD427-LexA     GAL4ad-LexA     1.2     CYH2       LEU2       this work 

Table 3 

Yeast strains used for the 5-FOA counterselection and the 
automated interaction mating 

Strain   Plasmids                 Selected on 

x1a      pBTM117c / pLUA          SD-trp-ade 
x2a      pBTM117c-SIM1 / pLUA     SD-trp-ade 
x3a      pBTM117c-HIP1 / pLUA     SD-trp-ade 
y1a      pGAD427 / pLUA           SD-leu-ade 
y2a      pGAD427-ARNT / pLUA      SD-leu-ade 
y3a      pGAD427-LexA / pLUA      SD-leu-ade 

Table 4 

Identification of fusion proteins that activate the URA3 
readout system. 

a. 

Strain   Plasmids               SD-trp-ade   SD-trp-ade+5-FOA   SD-trp-ade-ura 

x1a      pBTM117c / pLUA        +            + 
x2a      pBTM117c-SIM1 / pLUA   +            + 
x3a      pBTM117c-HIP1 / pLUA   +            + 

SD-trp-ade: Selective medium lacking tryptophan and adenine. 
SD-trp-ade+5-FOA: Selective medium containing 0.2 % 5-FOA. 
SD-trp-ade-ura: Selective medium lacking tryptophan, adenine 
and uracil. 

b. 

Strain   Plasmids               SD-leu-ade   SD-leu-ade+5-FOA   SD-leu-ade-ura 

y1a      pGAD427 / pLUA         +            + 
y2a      pGAD427-ARNT / pLUA    +            + 
y3a      pGAD427-LexA / pLUA    +            + 

SD-leu-ade: Selective medium lacking leucine and adenine. 
SD-leu-ade+5-FOA: Selective medium containing 0.2 % 5-FOA. 
SD-leu-ade-ura: Selective medium lacking leucine, adenine and 
uracil. 

Table 5 

Identification of fusion proteins that activate the LacZ 
readout system. 

A. L40ccu yeast cells transformed with pBTM117c plasmid 
constructs expressing a fusion protein comprising the LexA DNA 
binding domain are plated on minimal medium lacking tryptophan, 
buffered to pH 7.0 with potassium phosphate and containing 2 
µg/ml of X-Gal (SD-trp/XGAL): Results for the state of the 
readout system for various auto-activating and non-auto- 
activating fusion proteins. 

Plasmid           Fusion          Growth on      Blue 
construct         protein         SD-trp/XGAL    colouration 

pBTM117c          LexA            + 
pBTM117c-HD1.6    LexA-HD1.6 
pBTM117c-HD3.6    LexA-HD3.6      + 
pBTM117c-SIM1     LexA-SIM1       + 
pBTM117c-MJD      LexA-MJD 
pBTM117c-HIP1     LexA-HIP1       + 

B. L40ccuα yeast cells transformed with pGAD427 plasmid 
constructs expressing a fusion protein comprising the GAL4ad 
activation domain are plated on minimal medium lacking 
leucine, buffered to pH 7.0 with potassium phosphate and 
containing 2 µg/ml of X-Gal (SD-leu/XGAL): Results for the 
state of the readout system for various auto-activating and 
non-auto-activating fusion proteins. 

Plasmid           Fusion          Growth on      Blue 
construct         protein         SD-leu/XGAL    colouration 

pGAD427           GAL4ad          + 
pGAD427-ARNT      GAL4ad-ARNT     + 
pGAD427-HIP1      GAL4ad-HIP1     + 
pGAD427-HIPCT     GAL4ad-HIPCT    + 
pGAD427-14-3-3    GAL4ad-14-3-3   + 
pGAD427-LexA      GAL4ad-LexA     +              + 

CLAIMS 



1. A method for the identification of at least one member 
of a pair or complex of interacting molecules, 
comprising: 

(a) providing host cells containing at least two 
genetic elements with different selectable and 
counter-selectable markers, said genetic elements 
each comprising genetic information specifying one 
of said members, said host cells further carrying a 
readout system that is activated upon the 
interaction of said molecules; 

(b) allowing at least one interaction, if any, to 
occur; 

(c) selecting for said interaction by transferring 
progeny of said host cells to: 

(ca) at least two different selective media, wherein 
each of said selective media allows growth of said 
host cells only in the absence of at least one of 
said counter-selectable markers and in the presence 
of a selectable marker; and 

(cb) a further selective medium that allows 
identification of said host cells only on the 
activation of said readout system; 

(d) identifying host cells containing interacting 
molecules that: 

(da) do not activate said readout system on any of said 
selective media specified in (ca); and 

(db) activate the readout system on said selective 
medium specified in (cb); and 

(e) identifying at least one member of said pair or 
complex of interacting molecules. 

2. The method of claim 1 wherein said pair or complex of 
interacting molecules is selected from the group 
consisting of RNA-RNA, RNA-DNA, RNA-protein, DNA-DNA, 
DNA-protein, protein-protein, protein-peptide, or 
peptide-peptide interactions. 

3. The method of claim 1 or 2 wherein said genetic elements 
are plasmids, artificial chromosomes, viruses or other 
extrachromosomal elements. 

4. The method of any one of claims 1 to 3 wherein said 
interactions lead to the formation of a functional 
transcriptional activator comprising a DNA-binding and a 
transactivating protein domain and which is capable of 
activating a responsive moiety driving the activation of 
said readout system. 

5. The method of claim 4 wherein said readout system is a 
detectable protein. 

6. The method of claim 5 wherein said detectable protein is 
encoded from at least one of the genes lacZ, HIS3, URA3, 
LYS2, sacB, tetA, gfp or HPRT. 

7. The method of any one of claims 1 to 6 wherein said host 
cells are yeast cells, bacterial cells, mammalian cells, 
insect cells or plant cells. 

8. The method of any one of claims 1 to 7 further 
comprising transforming or transfecting said host cells 
with said genetic elements prior to step (a). 

9. The method of any one of claims 1 to 8 wherein cell 
fusion, conjugation or interaction mating is used for 
the generation of said host cells with said genetic 
elements prior to step (a). 

10. The method of any one of claims 1 to 9 wherein said 
counter-selectable markers selected against in step (ca) 
are selected from the group of CAN1, CYH2, LYS2, URA3, 
lacY, rpsL, HPRT and sacB. 

11. The method of any one of claims 1 to 10 wherein said 
selectable marker is an auxotrophic or antibiotic 
marker. 

12. The method of claim 11 wherein said auxotrophic or 
antibiotic marker is LEU2, TRP1, URA3, ADE2, HIS3, LYS2 
or Zeocin. 

13. The method of any one of claims 1 to 12 wherein progeny 
of host cells of step (b) are transferred to a storage 
compartment. 

14. The method of claim 13 wherein said transfer is effected 
or assisted by automation or a picking robot. 

15. The method of claim 13 or 14 wherein said storage 
compartment comprises an anti-freeze agent. 

16. The method of any one of claims 3 to 15 wherein said 
storage compartment is a microtiter plate. 

17. The method of claim 16 wherein said microtiter plate 
comprises 384 wells. 

18. The method of any one of claims 1 to 17 wherein said 
transfer in step (c) is made or assisted by automation, 
a spotting robot, pipetting or micropipetting device. 

19. The method of claim 18 wherein said transfer is made to 
a planar carrier. 

20. The method of claim 18 or 19 wherein said transfer is in 
a regular grid pattern of densities of 1 to 1000 clones 
per cm².
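
As an illustrative aside, the density range recited in claim 20 can be related to the spacing between neighbouring clones. The following sketch assumes a square grid (the claim only requires a regular grid), and the function name `spot_pitch_mm` is hypothetical.

```python
import math


def spot_pitch_mm(clones_per_cm2: float) -> float:
    """Centre-to-centre spacing in mm for a square grid of the given density."""
    if clones_per_cm2 <= 0:
        raise ValueError("density must be positive")
    # Assumption: square grid. Each clone then occupies 1/density cm^2,
    # so the pitch is the square root of that area (x10 converts cm to mm).
    return 10.0 / math.sqrt(clones_per_cm2)


for density in (1, 100, 1000):
    print(f"{density:>4} clones/cm^2 -> pitch {spot_pitch_mm(density):.2f} mm")
```

Under this assumption the claimed range of 1 to 1000 clones per cm² corresponds to spacings from 10 mm down to roughly a third of a millimetre.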

21. The method of any one of claims 18 to 20 wherein said 
planar carrier is a membrane. 

22. The method of any one of claims 1 to 21 wherein said 
identification of said host cells in step (d) is 
effected by visual means from consideration of the 
activation state of said readout system. 

23. The method of any one of claims 1 to 22 wherein said 
identification of said host cells in step (d) is 
effected by digital image storage, analysis or 
processing. 

24. The method of any one of claims 1 to 23 wherein said 
identification of said at least one member of said pair 
of interacting molecules is effected by nucleic acid 
hybridization, antibody binding or nucleic acid 
sequencing. 

25. The method of claim 24 wherein said identification made 
by said antibody reaction or said hybridization is 
effected using regular grids of said at least one member 
or of said genetic information encoding said at least 
one member. 

26. The method of claim 25 wherein construction of said 
regular grids is effected by automation or a spotting 
robot.

27. The method of any one of claims 24 to 26 wherein said 
identification is effected by digital image storage, 
processing or analysis. 

28. The method of any one of claims 24 to 27 wherein nucleic 
acid molecules, prior to said identification, are 
amplified by PCR or are amplified as a part of said 
genetic elements, preferably in bacteria and most 
preferably in E. coli.

29. The method of any one of claims 1 to 28 wherein, prior 
to step (a), a preselection against clones that express a 
single molecule able to activate the readout system is 
carried out on culture media comprising a 
counterselective compound.

30. The method of claim 29 wherein said counterselective 
compound is 5-fluoroorotic acid, canavanine, 
cycloheximide, sucrose, tONPG, streptomycin or 
α-aminoadipate.

31. A method for the production of a pharmaceutical 
composition comprising formulating said at least one 
member of the interacting molecules identified by the 
method of any one of claims 1 to 30 in a 
pharmaceutically acceptable form. 

32. A method for the production of a pharmaceutical 
composition comprising formulating an inhibitor of the 
interaction of the interacting molecules identified by 
the method of any one of claims 1 to 30 in a 
pharmaceutically acceptable form. 

33. A method for the production of a pharmaceutical 
composition comprising identifying a further molecule of 
a cascade of interacting molecules, of which the at 
least one member of said interacting molecules 
identified by the method of any one of claims 1 to 30 is 
a part, or identifying an inhibitor of said further 
molecule.



34. Kit comprising at least one of the following: 

(f) host cells as identified in any of the preceding 
claims and at least one genetic element comprising 
said genetic information specifying at least one of 
said possibly interacting molecules containing a 
counterselectable marker and specified in any of 
the preceding claims; 

(g) host cells as identified in any of the preceding 
claims and at least one genetic element not 
comprising genetic information specifying at least 
one of said potentially interacting molecules 
containing a counterselectable marker and specified 
in any of the preceding claims; 

(h) at least one genetic element comprising said 
genetic information specifying at least one of said 
potentially interacting molecules containing a 
counterselectable marker and specified in any of 
the preceding claims; 

(i) at least one genetic element not comprising genetic 
information specifying at least one of said 
potentially interacting molecules containing a 
counterselectable marker and specified in any of 
the preceding claims; 

(j) host cells comprising at least one and preferably 
at least two of said genetic elements specified in 
(h) or (i);

(k) at least one planar carrier carrying nucleic acid 
or protein from said host cells comprising at least 
one member of said genetic elements specified in 
any of the preceding claims wherein said nucleic 
acid or protein is affixed to said carrier in grid 
form and optionally solutions to effect 
hybridization or binding of nucleic acid probes or 
proteins to said molecules affixed to said grid; 

(l) at least one storage compartment, planar carrier or 
computer disc comprising or/and characterizing 
genetic elements, host cells, storage compartments 
or carriers identified in any of (f) to (k); and/or 

(m) at least one yeast strain comprising a can1 and a 
cyh2 mutation.

35. The kit of claim 34, wherein said host cells of (f), (g) 
or (j) are contained in at least one storage 
compartment.

36. The kit of claim 34 or 35, wherein said genetic 
information or said potentially interacting molecules 
encoded by said genetic information as specified in (f) 
or (h) are contained in at least one storage 
compartment.

37. A computer-implemented method for storing and analysing 
data relating to potential members of at least one pair 
or complex of interacting molecules encoded by nucleic 
acids originating from biological samples, said method 
comprising: 

(n) retrieving from a first data-table information for 
a first nucleic acid, wherein said information 
comprises: 

(oa) a first combination of letters and/or numbers 
uniquely identifying the nucleic acid, 

(ob) the type of genetic element comprising said nucleic 
acid, and 

(oc) a second combination of letters and/or numbers 
uniquely identifying a clone in which a potential 
member encoded by said nucleic acid was tested for 
interaction with at least one other potential 
member of a pair or complex of interacting 
molecules; and 

(p) using said second combination of letters and/or 
numbers to retrieve from said first data-table or 
optionally further data-tables, information 
identifying additional nucleic acids encoding 
said at least one other potential member in step 
(oc).
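
The two retrieval steps of claim 37 can be pictured with a minimal in-memory sketch. It is only an illustration under stated assumptions: the record layout, the identifiers (`NA0001`, `CL0007`), the element-type labels and the function names `retrieve` and `partners` are all hypothetical and do not come from the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NucleicAcidRecord:
    nucleic_acid_id: str  # first identifier: uniquely names the nucleic acid
    element_type: str     # type of genetic element comprising the nucleic acid
    clone_id: str         # second identifier: clone in which it was tested


# Hypothetical first data-table; every value is invented for illustration.
FIRST_TABLE = [
    NucleicAcidRecord("NA0001", "DNA-binding-domain vector", "CL0007"),
    NucleicAcidRecord("NA0002", "activation-domain vector", "CL0007"),
    NucleicAcidRecord("NA0003", "activation-domain vector", "CL0012"),
]


def retrieve(table, nucleic_acid_id):
    """First step: look up the record for one nucleic acid."""
    for record in table:
        if record.nucleic_acid_id == nucleic_acid_id:
            return record
    raise KeyError(nucleic_acid_id)


def partners(table, record):
    """Second step: use the clone identifier to find the other nucleic acids."""
    return [
        other
        for other in table
        if other.clone_id == record.clone_id
        and other.nucleic_acid_id != record.nucleic_acid_id
    ]


first = retrieve(FIRST_TABLE, "NA0001")
print([p.nucleic_acid_id for p in partners(FIRST_TABLE, first)])
```

In this sketch the clone identifier acts as the only link between the two nucleic acids, which mirrors the role the claim gives to the second combination of letters and/or numbers.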

38. The method of claim 37 further comprising using said 
second combination of letters and/or numbers in step 
(oc) to retrieve from a second data-table further 
information, where said further information at least 
comprises the interaction class of said clone, and 
optionally additional information comprising: 

(q) the physical location of the clone, 

(r) predetermined experimental details pertaining to 
creation of said clone, including at least one of: 

(ra) tissue, disease-state or cell source of the nucleic 
acid, 

(rb) cloning details, and 

(rc) membership of a library of other clones.

39. The method of claim 38 further comprising using said 
information of step (o) on said first and/or of step (p) 
on additional nucleic acids to relate to a third 
data-table further characterising said first and/or 
additional nucleic acids, where said further 
characterising comprises at least one of: 

(s) hybridization data; 

(t) oligonucleotide fingerprint data; 

(u) nucleotide sequence; 

(v) in-frame translation of said nucleic acids; 

(w) tissue, disease-state or cell source gene 
expression data; and 

optionally identifying the protein domain encoded by 
said first or additional nucleic acids.

40. The method of claim 39 further comprising identifying 
whether said potential members encoded by the nucleic acids 
interact, by considering said interaction class of said 
clone in which the nucleic acids were tested for said 
interaction in step (oc).



41. The method of one of claims 37 to 40, wherein said data 
relates to 10 to 100 potential members, preferably 100 
to 1000 potential members, more preferably 1000 to 
10,000 potential members and most preferably more than 
10,000 potential members. 

42. The method of one of claims 37 to 41, wherein said data 
was generated by the method of claims 1 to 36. 

43. The method of claims 38 to 42, wherein said interaction 
class comprises one of the following: 

(x) Positive 
(y) Negative 
(z) False Positive 

44. The method of one of claims 40 to 43 wherein sticky 
proteins are identified by considering the number of 
different members with which a given member is identified 
to interact in different clones of said positive 
interaction class.
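
One way to picture the sticky-protein criterion of claim 44 is to count, for each member, the distinct partners seen in clones of the positive class. The sketch below is an assumption-laden illustration: the tuple layout, the member names, the threshold parameter `min_partners` and the function name `sticky_members` are hypothetical, and the claim itself does not fix any threshold.

```python
from collections import defaultdict

# Hypothetical clone records: (clone_id, member_a, member_b, interaction_class)
CLONES = [
    ("CL01", "P1", "P2", "Positive"),
    ("CL02", "P1", "P3", "Positive"),
    ("CL03", "P1", "P4", "Positive"),
    ("CL04", "P2", "P3", "Positive"),
    ("CL05", "P1", "P5", "False Positive"),
]


def sticky_members(clones, min_partners):
    """Return members with at least `min_partners` distinct positive partners."""
    partners = defaultdict(set)
    for _clone_id, a, b, interaction_class in clones:
        if interaction_class != "Positive":
            continue  # only clones of the positive class are counted
        partners[a].add(b)
        partners[b].add(a)
    return sorted(m for m, found in partners.items() if len(found) >= min_partners)


print(sticky_members(CLONES, min_partners=3))
```

Using sets rather than raw counts means that the same partner observed in several clones is counted once, which matches the emphasis on different members.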

45. The method of one of claims 37 to 44, wherein said first 
data-table forms part of a first database, and said 
second and third data tables form part of at least a 
second database.

46. The method of claim 45, wherein said second database is 
held on a computer readable memory separate from the 
computer readable memory holding said first database, 
and said database is accessed via a data exchange 
network. 

47. The method of claim 46, wherein said second database 
comprises nucleic acid or protein sequence, secondary or 
tertiary structure, biochemical, bibliographical or gene 
expression information. 

48. The method of claims 37 to 47, wherein data entry to 
said first, second or further data tables is controlled 
automatically from said first database by access to 
other computer data, programs or computer controlled 
robots.

49. The method of one of claims 37 to 48, wherein at least 
one workflow management system is built around 
particular sets of data to assist in the progress of the 
method of claims 1 to 36. 

50. The method of claim 49, wherein said workflow management 
system is software to assist in the progress of the 
identification of members of a pair or complex of 
interacting molecules using the method of hybridization 
as specified in claims 24 to 28.

51. The method of claims 37 to 50, wherein said data are 
investigated by queries of interest to an investigator. 

52. The method of claim 51, wherein said queries include at 
least one of: 

(aa) identifying the interaction or interaction pathway 
between a first and second member of an interaction 
network, 

(ab) identifying the interaction pathway between a first 
and second member of an interaction network and 
through at least one further member, 

(ac) identifying the interaction or interaction pathway 
between at least two members characterised by 
nucleic acid or protein sequences, secondary or 
tertiary structures, and 

(ad) identifying interactions or interaction pathways 
that are different for said different tissue, 
disease-state or cell source.
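
A pathway query of the kind listed in claim 52 can be sketched as a graph search over pairwise interactions. The example assumes that interactions are undirected and uses breadth-first search to return one shortest pathway. The claims do not prescribe any particular algorithm, and the member names and function names here are hypothetical.

```python
from collections import deque

# Hypothetical pairwise interactions between members of a network.
INTERACTIONS = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]


def build_network(pairs):
    """Turn interaction pairs into an undirected adjacency mapping."""
    network = {}
    for a, b in pairs:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def shortest_pathway(network, start, goal):
    """Breadth-first search; returns a list of members or None if unconnected."""
    if start not in network or goal not in network:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(network[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


net = build_network(INTERACTIONS)
print(shortest_pathway(net, "A", "D"))
```

A direct interaction is returned as a two-member pathway, while a pathway through further members, as in item (ab), appears as a longer list.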

53. The method of claims 51 or 52, wherein parts of said 
information are stored in a controlled format to assist 
data query procedures.

54. The method of claims 51 to 53, wherein the results of 
said queries are displayed to the investigator in a 
graphical manner. 

55. The method of claim 54, wherein a sub-set of data 
comprising data characterising nucleic acids identified 
as encoding members of a pair or complex of interacting 
molecules of claim 40 is stored in a further data-table 
or database.

56. The method of claim 55 wherein consideration of the 
number of occurrences a given member is identified to 
interact with a second or further member is used to 
decide if said data characterising nucleic acids form 
part of said sub-set of data. 
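
The selection rule of claim 56 might be sketched as a count of how often the same pair of members recurs across clones, keeping only pairs that reach a chosen count. The cut-off `min_occurrences`, the record layout and the function name `recurrent_pairs` are assumptions made for illustration; the claim does not specify a threshold.

```python
from collections import Counter

# Hypothetical positive clones: (clone_id, member_a, member_b)
POSITIVE_CLONES = [
    ("CL01", "P1", "P2"),
    ("CL02", "P2", "P1"),
    ("CL03", "P1", "P3"),
    ("CL04", "P1", "P2"),
]


def recurrent_pairs(clones, min_occurrences):
    """Count each unordered pair and keep those seen often enough."""
    # frozenset makes (P1, P2) and (P2, P1) count as the same pair.
    counts = Counter(frozenset((a, b)) for _clone_id, a, b in clones)
    return {pair: n for pair, n in counts.items() if n >= min_occurrences}


subset = recurrent_pairs(POSITIVE_CLONES, min_occurrences=2)
print({tuple(sorted(pair)): n for pair, n in subset.items()})
```

Pairs that pass the cut-off would then be the ones whose characterising data are copied into the sub-set.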

57. The method of claims 55 or 56, wherein additional 
information or experimental data is used to select those 
data to form part of said subset.

58. The method of claims 55 to 57, wherein to speed certain 
data query procedures, the structure in which the data 
is stored in the computer readable memory is modified. 

59. The method of one of claims 37 to 58, wherein the data 
is held in relational or object oriented data bases. 

60. A data storage scheme comprising a data table that holds 
information on each member of an interaction, where a 
record in said table represents each member of an 
interaction, and in which members are indicated to form 
interactions by sharing a common name. 

61. The data storage scheme of claim 60, wherein said common 
name is a clone name or unique combination of letters 
and/or numbers comprising said clone name. 
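
The storage scheme of claims 60 and 61 can be illustrated with a single relational table in which each row is one member and members sharing a clone name form an interaction. The sketch uses Python's built-in `sqlite3` module. The table name, column names, vector labels and identifiers are hypothetical, and a relational database is just one of the options mentioned in claim 59.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE member (
        nucleic_acid_id TEXT NOT NULL,
        element_type    TEXT NOT NULL,
        clone_name      TEXT NOT NULL,
        PRIMARY KEY (nucleic_acid_id, clone_name)
    )
    """
)
# Hypothetical records: one row per member of an interaction.
conn.executemany(
    "INSERT INTO member VALUES (?, ?, ?)",
    [
        ("NA0001", "bait vector", "CL0007"),
        ("NA0002", "prey vector", "CL0007"),
        ("NA0003", "prey vector", "CL0012"),
    ],
)


def interactions(connection):
    """Self-join on the shared clone name to list interacting pairs."""
    rows = connection.execute(
        """
        SELECT a.clone_name, a.nucleic_acid_id, b.nucleic_acid_id
        FROM member AS a
        JOIN member AS b
          ON a.clone_name = b.clone_name
         AND a.nucleic_acid_id < b.nucleic_acid_id
        ORDER BY a.clone_name, a.nucleic_acid_id
        """
    )
    return rows.fetchall()


print(interactions(conn))
```

No separate interaction table is needed in this layout: the shared clone name alone indicates which members belong together, and a clone with only one recorded member simply yields no pair.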